GENES ENCODING OLFACTORY RECEPTORS AND BIALLELIC MARKERS THEREOF

FIELD OF THE INVENTION 

The present invention pertains to a purified or isolated nucleic acid comprising ten open
reading frames (ORFs) encoding ten different olfactory receptor-like proteins, non-coding regions
flanking the ORFs as well as fragments thereof. The invention also provides recombinant expression
vectors and recombinant cell hosts containing a nucleic acid encoding said olfactory receptor
proteins. The invention also concerns the olfactory receptor proteins encoded by these ORFs as well
as polypeptides that are homologous to said olfactory receptor proteins and the peptide fragments of
both the olfactory receptor proteins and their homologous polypeptide counterparts. The invention
also deals with antibodies directed specifically against such polypeptides that are useful as
diagnostic reagents. The invention further encompasses biallelic markers of the olfactory receptor
gene useful in genetic analysis. The invention also deals with methods and kits for the detection of
the olfactory receptor proteins and with methods and kits for screening ligand molecules binding to
these proteins.

BACKGROUND OF THE INVENTION 

Throughout this application, various bibliographic publications are cited. Full bibliographic 
references for these publications may be found at the end of this application, preceding the sequence 
listing and the claims. 

OLFACTORY SYSTEM

The olfactory receptor cells, the first cells in the pathway that give rise to the sense of smell, 
lie in a small patch of membrane, the olfactory epithelium, in the upper part of the nasal cavity. 
These cells are specialized afferent neurons that have an enlarged extension analogous to a dendrite. 
Several long hairlike processes extend out from this extension along the surface of the olfactory 
epithelium where they are bathed in mucus. The hairlike processes contain the receptor proteins for
olfactory stimuli. The axons of these neurons form the olfactory nerve. 

For an odorous substance, called an odorant, to be detected, molecules of the
substance must first diffuse into the air and pass into the nose to the region of the olfactory
epithelium. Once there, they dissolve in the mucus that covers the epithelium and then bind to
specific receptor proteins on the cilia.

Although there are many thousands of olfactory neurons, each contains one, or at most a 
few, of the 1,000 or so different receptor types, each of which responds only to a specific chemically 
related group of odorant molecules. Each odorant has characteristic chemical groups that distinguish 
it from other odorants, and each of these groups activates a different receptor type. Thus the identity 
of a particular odorant is determined by the activation of a precise combination of receptors, each of
which is contained in a distinct group of olfactory neurons. 

The axons of the olfactory neurons synapse in the brain structures known as olfactory bulbs, 
which lie on the undersurface of the frontal lobes. Axons from olfactory neurons sharing a common 
receptor specificity synapse together on certain olfactory-bulb neurons, thereby maintaining the
specificity of the original stimuli. 

OLFACTORY RECEPTORS 

In contrast with the immunoglobulin system, the diversity of olfactory receptors is encoded 
by a large germ-line repertoire of olfactory receptor genes. The size of the olfactory receptor gene
family in the human genome is unknown but it has been estimated to encompass 200 to 1,000 genes.

The locations of only a few human genes have been determined to date. The picture that has 
emerged so far is that several large clusters of olfactory genes and pseudogenes span hundreds of 
kilobases on several chromosomes. Using FISH analyses, more than 25 distinct locations of 
olfactory receptor genes have been identified in the human genome.

In mammals, the olfactory epithelium appears to be organized into distinct topographic
regions or zones in which expression of a particular receptor gene appears to be restricted to one of
the four zones in the epithelium. Within the zone, the distribution of neurons expressing a given
receptor is random. Chromosomal mapping studies have revealed clusters of odorant receptor genes
at a single locus, and numerous such loci have been mapped to different chromosomes. However,
receptors expressed in the same zone map to different loci, and a single locus can contain genes
expressed in different zones. A putative odorant receptor promoter, consisting of the 6.7 kb DNA
fragment upstream of the receptor coding region, has been shown to be sufficient to direct olfactory
receptor expression in a tissue-specific, zonal-specific manner.

Olfactory receptors share a seven-transmembrane domain structure (TM1 to TM7) with
many neurotransmitter and hormone receptors. They show a high degree of sequence similarity in
some conserved domains (TM2 and TM7) as well as regions of diversity (TM3, TM4, TM5, and
TM6). They are responsible for the recognition and G protein-mediated transduction of odorant
signals. The genes encoding these receptors are devoid of introns within their coding regions.

Olfactory receptors display all hallmarks of the G-protein coupled receptor superfamily but
also have some unique motifs. Most notably they appear to be minimal in structure with very short
cytoplasmic and extracellular loops. In addition, they display a striking structural diversity in the
third, fourth and fifth transmembrane domains which are supposed to form the hydrophobic core of
these proteins, and may form the ligand binding site of the receptors.

An understanding of the genetic basis of olfaction and a knowledge of olfactory receptors
are important to enable the design of fragrances, the identification of compounds which control
appetite, or the detection of compounds which can be harmful or dangerous.
SUMMARY OF THE INVENTION

This invention provides a nucleic acid molecule encoding ten different olfactory receptor- 
like proteins (OLF). 

The invention also deals with a nucleic acid molecule comprising a nucleotide sequence
encoding an olfactory receptor-like protein, which nucleotide sequence is selected from the group
consisting of SEQ ID Nos 2-11, as well as with the corresponding polypeptide encoded by this
nucleotide sequence and with antibodies directed against the corresponding polypeptide.

Oligonucleotide probes or primers hybridizing specifically with an olfactory receptor 
genomic sequence are also part of the present invention, as well as DNA amplification and detection 
methods using said primers and probes.

The invention also concerns a purified and/or isolated biallelic marker located in the
sequence of the olfactory receptor gene cluster of the invention, wherein said biallelic marker is
useful as a diagnostic tool to detect an allele associated with a specific phenotype relating
to the olfactory system, including an alteration of the olfactory perception of substances or
molecules.

A further object of the invention consists of recombinant vectors comprising any of the
nucleic acid sequences described above, and in particular of recombinant vectors comprising a
sequence encoding an olfactory receptor protein, as well as of cell hosts and transgenic non-human
animals comprising said nucleic acid sequences or recombinant vectors.

A further object of the invention consists of methods for screening substances or molecules
interacting with an olfactory receptor encoded by any of the nucleic acid molecules described above.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1: Alignment of the amino acid sequences of the olfactory polypeptides encoded by
the open reading frames of the olfactory receptor gene cluster of the invention. The lower line
represents the consensus sequence. The locations of the seven transmembrane domains TM1 to TM7
are boxed.

BRIEF DESCRIPTION OF THE SEQUENCES PROVIDED IN THE SEQUENCE LISTING

SEQ ID No 1 contains the olfactory receptor genomic sequence.

SEQ ID Nos 2-11 contain the nucleotide sequences of the open reading frames of
SEQ ID No 1 encoding the OLF1 to OLF10 polypeptides.

SEQ ID Nos 12-21 contain the amino acid sequences of the OLF1 to OLF10 polypeptides
encoded by the open reading frames of SEQ ID Nos 2-11.

SEQ ID Nos 22-25 contain the amplification primers used for FISH experiments described
in Example 1.

SEQ ID No 26 contains a primer containing the additional PU 5' sequence described further 
in Example 3. 

SEQ ID No 27 contains a primer containing the additional RP 5' sequence described further
in Example 3.

In accordance with the regulations relating to Sequence Listings, the following codes have
been used in the Sequence Listing to indicate the locations of biallelic markers within the sequences
and to identify each of the alleles present at the polymorphic base. The code "r" in the sequences
indicates that one allele of the polymorphic base is a guanine, while the other allele is an adenine.
The code "y" in the sequences indicates that one allele of the polymorphic base is a thymine, while
the other allele is a cytosine. The code "m" in the sequences indicates that one allele of the
polymorphic base is an adenine, while the other allele is a cytosine. The code "k" in the sequences
indicates that one allele of the polymorphic base is a guanine, while the other allele is a thymine.
The code "s" in the sequences indicates that one allele of the polymorphic base is a guanine, while
the other allele is a cytosine. The code "w" in the sequences indicates that one allele of the
polymorphic base is an adenine, while the other allele is a thymine.
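
The six codes listed above can be summarized as a simple lookup table. The following is a minimal sketch only; the names IUPAC_BIALLELIC, alleles_for and find_biallelic_sites are hypothetical and are not part of the Sequence Listing or of any library.

```python
# Ambiguity codes for polymorphic bases, exactly as enumerated in the text:
# each code maps to the two alleles observed at the biallelic site.
IUPAC_BIALLELIC = {
    "r": ("G", "A"),
    "y": ("T", "C"),
    "m": ("A", "C"),
    "k": ("G", "T"),
    "s": ("G", "C"),
    "w": ("A", "T"),
}


def alleles_for(code):
    """Return the two alleles denoted by a biallelic ambiguity code."""
    return IUPAC_BIALLELIC[code.lower()]


def find_biallelic_sites(sequence):
    """List (1-based position, code, alleles) for each ambiguity code found."""
    sites = []
    for position, base in enumerate(sequence.lower(), start=1):
        if base in IUPAC_BIALLELIC:
            sites.append((position, base, IUPAC_BIALLELIC[base]))
    return sites
```

For instance, scanning a hypothetical sequence such as "acgtrac" with find_biallelic_sites would report a single site at position 5 carrying the alleles G and A.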

The nucleotide code of the original allele for each biallelic marker is the following: 



Biallelic marker    Original allele
99-13670-305        G
99-13669-471        G
99-13666-275        A
99-13664-221        T
99-13663-218        G
99-13660-277        C
99-13652-407        G
99-13652-357        A
99-13652-308        A
99-13671-396        A
99-13649-286        C
99-13648-259        G
99-13647-278        G



DETAILED DESCRIPTION OF THE INVENTION 

The aim of the present invention is to provide polynucleotides and polypeptides related to
novel olfactory receptors, notably useful in order to design suitable means for detecting specific
odorant molecules in a material sample, particularly in a material sample suspected to contain an
odorant molecule that consists of one of the specific ligands for the olfactory receptors of the
invention.
DEFINITIONS

Before describing the invention in greater detail, the following definitions are set forth to 
illustrate and define the meaning and scope of the terms used to describe the invention herein. 

General definitions 

The terms "olfactory receptor gene" or "OLF1 to OLF10 genes", when used herein,
encompass genomic, mRNA and cDNA sequences encoding the OLF1 to OLF10 olfactory
receptor proteins.

The term "heterologous protein", when used herein, is intended to designate any protein or
polypeptide other than the OLF1 to OLF10 proteins.

The term "isolated" requires that the material be removed from its original environment
(e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring
polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide
or DNA or polypeptide, separated from some or all of the coexisting materials in the natural system,
is isolated. Such polynucleotide could be part of a vector and/or such polynucleotide or polypeptide
could be part of a composition, and still be isolated in that the vector or composition is not part of its
natural environment.

The term "purified" does not require absolute purity; rather, it is intended as a relative
definition. Purification of starting material or natural material to at least one order of magnitude,
preferably two or three orders, and more preferably four or five orders of magnitude is expressly
contemplated. As an example, purification from 0.1% concentration to 10% concentration is two
orders of magnitude. The term "purified polynucleotide" is used herein to describe a polynucleotide
or polynucleotide vector of the invention which has been separated from other compounds including,
but not limited to other nucleic acids, carbohydrates, lipids and proteins (such as the enzymes used
in the synthesis of the polynucleotide), or the separation of covalently closed polynucleotides from
linear polynucleotides. A polynucleotide is substantially pure when at least about 50%, preferably
60 to 75% of a sample exhibits a single polynucleotide sequence and conformation (linear versus
covalently closed). A substantially pure polynucleotide typically comprises about 50%, preferably 60
to 90% weight/weight of a nucleic acid sample, more usually about 95%, and preferably is over
about 99% pure. Polynucleotide purity or homogeneity is indicated by a number of means well
known in the art, such as agarose or polyacrylamide gel electrophoresis of a sample, followed by
visualizing a single polynucleotide band upon staining the gel. For certain purposes higher
resolution can be provided by using HPLC or other means well known in the art.
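
As a rough illustration of the arithmetic behind the "orders of magnitude" wording in this definition, the degree of purification can be expressed as the base-10 logarithm of the ratio of final to starting concentration. The function name purification_orders is hypothetical and is used here only for illustration.

```python
import math


def purification_orders(start_concentration, end_concentration):
    """Orders of magnitude of purification between two concentrations.

    Both values must be positive and expressed in the same unit
    (for example, percent of the sample).
    """
    if start_concentration <= 0 or end_concentration <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(end_concentration / start_concentration)
```

With the figures given in the text, purification_orders(0.1, 10) evaluates to approximately 2, i.e. two orders of magnitude.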

The term "polypeptide" refers to a polymer of amino acids without regard to the length of
the polymer; thus, peptides, oligopeptides, and proteins are included within the definition of
polypeptide. This term also does not specify or exclude post-expression modifications of
polypeptides, for example, polypeptides which include the covalent attachment of glycosyl groups,
acetyl groups, phosphate groups, lipid groups and the like are expressly encompassed by the term
polypeptide. Also included within the definition are polypeptides which contain one or more
analogs of an amino acid (including, for example, non-naturally occurring amino acids, amino acids
which only occur naturally in an unrelated biological system, modified amino acids from
mammalian systems etc.), polypeptides with substituted linkages, as well as other modifications
known in the art, both naturally occurring and non-naturally occurring.

The term "recombinant polypeptide" is used herein to refer to polypeptides that have been
artificially designed and which comprise at least two polypeptide sequences that are not found as 
contiguous polypeptide sequences in their initial natural environment, or to refer to polypeptides 
which have been expressed from a recombinant polynucleotide. 

The term "purified polypeptide" is used herein to describe a polypeptide of the invention
which has been separated from other compounds including, but not limited to nucleic acids, lipids,
carbohydrates and other proteins. A polypeptide is substantially pure when at least about 50%,
preferably 60 to 75% of a sample exhibits a single polypeptide sequence. A substantially pure
polypeptide typically comprises about 50%, preferably 60 to 90% weight/weight of a protein sample,
more usually about 95%, and preferably is over about 99% pure. Polypeptide purity or homogeneity
is indicated by a number of means well known in the art, such as polyacrylamide gel electrophoresis
of a sample, followed by visualizing a single polypeptide band upon staining the gel. For certain
purposes higher resolution can be provided by using HPLC or other means well known in the art.

As used herein, the term "non-human animal" refers to any non-human vertebrate, birds and
more usually mammals, preferably primates, farm animals such as swine, goats, sheep, donkeys, and
horses, rabbits or rodents, more preferably rats or mice. As used herein, the term "animal" is used to
refer to any vertebrate, preferably a mammal. Both the terms "animal" and "mammal" expressly
embrace human subjects unless preceded with the term "non-human".

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides which
are comprised of at least one binding domain, where an antibody binding domain is formed from the
folding of variable domains of an antibody molecule to form three-dimensional binding spaces with
an internal surface shape and charge distribution complementary to the features of an antigenic
determinant of an antigen, which allows an immunological reaction with the antigen. Antibodies
include recombinant proteins comprising the binding domains, as well as fragments, including Fab,
Fab', F(ab)2, and F(ab')2 fragments.

As used herein, an "antigenic determinant" is the portion of an antigen molecule, in this case
an OLF1 to OLF10 polypeptide, that determines the specificity of the antigen-antibody reaction. An
"epitope" refers to an antigenic determinant of a polypeptide. An epitope can comprise as few as 3
amino acids in a spatial conformation which is unique to the epitope. Generally an epitope
comprises at least 6 such amino acids, and more usually at least 8-10 such amino acids. Methods for
determining the amino acids which make up an epitope include x-ray crystallography, 2-dimensional
nuclear magnetic resonance, and epitope mapping e.g. the Pepscan method described by Geysen et
al. 1984; PCT Publication No. WO 84/03564; and PCT Publication No. WO 84/03506.

Throughout the present specification, the expression "nucleotide sequence" may be
employed to designate indifferently a polynucleotide or a nucleic acid. More precisely, the
expression "nucleotide sequence" encompasses the nucleic material itself and is thus not restricted to
the sequence information (i.e. the succession of letters chosen among the four base letters) that
biochemically characterizes a specific DNA or RNA molecule.

As used interchangeably herein, the terms "nucleic acids", "oligonucleotides", and
"polynucleotides" include RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide
in either single chain or duplex form. The term "nucleotide" is used herein as an adjective to
describe molecules comprising RNA, DNA, or RNA/DNA hybrid sequences of any length in
single-stranded or duplex form. The term "nucleotide" is also used herein as a noun to refer to individual
nucleotides or varieties of nucleotides, meaning a molecule, or individual unit in a larger nucleic
acid molecule, comprising a purine or pyrimidine, a ribose or deoxyribose sugar moiety, and a
phosphate group, or phosphodiester linkage in the case of nucleotides within an oligonucleotide or
polynucleotide. The term "nucleotide" is also used herein to encompass "modified nucleotides"
which comprise at least one of the following modifications: (a) an alternative linking group, (b) an
analogous form of purine, (c) an analogous form of pyrimidine, or (d) an analogous sugar. For
examples of analogous linking groups, purines, pyrimidines, and sugars see for example PCT
publication No. WO 95/04064.
The polynucleotide sequences of the invention may be prepared by any known method, including
synthetic, recombinant, ex vivo generation, or a combination thereof, as well as utilizing any
purification methods known in the art.

A "promoter" refers to a DNA sequence recognized by the synthetic machinery of the cell
required to initiate the specific transcription of a gene. 

A sequence which is "operably linked" to a regulatory sequence such as a promoter means
that said regulatory element is in the correct location and orientation in relation to the nucleic acid to
control RNA polymerase initiation and expression of the nucleic acid of interest. As used herein, the
term "operably linked" refers to a linkage of polynucleotide elements in a functional relationship.
For instance, a promoter or enhancer is operably linked to a coding sequence if it affects the
transcription of the coding sequence. More precisely, two DNA molecules (such as a polynucleotide
containing a promoter region and a polynucleotide encoding a desired polypeptide or
polynucleotide) are said to be "operably linked" if the nature of the linkage between the two
polynucleotides does not (1) result in the introduction of a frame-shift mutation or (2) interfere with
the ability of the polynucleotide containing the promoter to direct the transcription of the coding
polynucleotide.

The term "vector" is used herein to designate either a circular or a linear DNA or RNA
molecule, which is either double-stranded or single-stranded, and which comprises at least one
polynucleotide of interest that is sought to be transferred into a cell host or into a unicellular or
multicellular host organism.

The term "primer" denotes a specific oligonucleotide sequence which is complementary to a
target nucleotide sequence and used to hybridize to the target nucleotide sequence. A primer serves
as an initiation point for nucleotide polymerization catalyzed by either DNA polymerase, RNA
polymerase or reverse transcriptase.

The term "probe" denotes a defined nucleic acid segment (or nucleotide analog segment,
e.g., polynucleotide as defined hereinbelow) which can be used to identify a specific polynucleotide
sequence present in samples, said nucleic acid segment comprising a nucleotide sequence
complementary to the specific polynucleotide sequence to be identified.

The terms "trait" and "phenotype" are used interchangeably herein and refer to any visible, 
detectable or otherwise measurable property of an organism such as symptoms of, or susceptibility 
to a disease for example. 

The term "allele" is used herein to refer to variants of a nucleotide sequence. A biallelic
polymorphism has two forms. Diploid organisms may be homozygous or heterozygous for an allelic
form.

The term "genotype" as used herein refers to the identity of the alleles present in an individual
or a sample. In the context of the present invention, a genotype preferably refers to the description
of the biallelic marker alleles present in an individual or a sample. The term "genotyping" a sample
or an individual for a biallelic marker involves determining the specific allele or the specific
nucleotide carried by an individual at a biallelic marker.

The term "mutation" as used herein refers to a difference in DNA sequence between or
among different genomes or individuals which has a frequency below 1%. 

The term "polymorphism" as used herein refers to the occurrence of two or more alternative
genomic sequences or alleles between or among different genomes or individuals. "Polymorphic"
refers to the condition in which two or more variants of a specific genomic sequence can be found in
a population. A "polymorphic site" is the locus at which the variation occurs. A single nucleotide
polymorphism is the replacement of one nucleotide by another nucleotide at the polymorphic site.
Deletion of a single nucleotide or insertion of a single nucleotide also gives rise to single nucleotide
polymorphisms. In the context of the present invention, "single nucleotide polymorphism"
preferably refers to a single nucleotide substitution. Typically, between different individuals, the
polymorphic site may be occupied by two different nucleotides.

The terms "biallelic polymorphism" and "biallelic marker" are used interchangeably herein to
refer to a single nucleotide polymorphism having two alleles at a fairly high frequency in the
population. A "biallelic marker allele" refers to the nucleotide variants present at a biallelic marker
site.
The location of nucleotides in a polynucleotide with respect to the center of the
polynucleotide is described herein in the following manner. When a polynucleotide has an odd
number of nucleotides, the nucleotide at an equal distance from the 3' and 5' ends of the
polynucleotide is considered to be "at the center" of the polynucleotide, and any nucleotide
immediately adjacent to the nucleotide at the center, or the nucleotide at the center itself is
considered to be "within 1 nucleotide of the center." With an odd number of nucleotides in a
polynucleotide any of the five nucleotide positions in the middle of the polynucleotide would be
considered to be within 2 nucleotides of the center, and so on. When a polynucleotide has an even
number of nucleotides, there would be a bond and not a nucleotide at the center of the
polynucleotide. Thus, either of the two central nucleotides would be considered to be "within 1
nucleotide of the center" and any of the four nucleotides in the middle of the polynucleotide would
be considered to be "within 2 nucleotides of the center", and so on.
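
The convention just described can be restated as a short calculation. The sketch below is one possible reading of the definition, with hypothetical function names; positions are assumed to be numbered from 1 at the 5' end.

```python
def distance_from_center(length, position):
    """Distance of a 1-based position from the center of a polynucleotide.

    Odd length: the central nucleotide is at distance 0 and its
    neighbours at distance 1. Even length: the center is a bond, so
    the two central nucleotides are at distance 1, the next pair at 2.
    """
    if not 1 <= position <= length:
        raise ValueError("position outside the polynucleotide")
    # Twice the offset from the geometric center, rounded up after halving.
    return (abs(2 * position - (length + 1)) + 1) // 2


def within_n_of_center(length, position, n):
    """True if the position is 'within n nucleotides of the center'."""
    return distance_from_center(length, position) <= n
```

For a polynucleotide of 47 nucleotides, position 24 is at the center, positions 23 to 25 are within 1 nucleotide of the center, and positions 22 to 26 are within 2 nucleotides of the center.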

Biallelic markers can be defined as genome-derived polynucleotides having between 2 and
100, preferably between 20, 30, or 40 and 60, and more preferably about 47 nucleotides in length,
which exhibit biallelic polymorphism at one single base position. Each biallelic marker therefore
corresponds to two forms of a polynucleotide sequence included in a gene which, when compared
with one another, present a nucleotide modification at one position.

The term "upstream" is used herein to refer to a location which is toward the 5' end of the
polynucleotide from a specific reference point. 

The terms "base paired" and "Watson & Crick base paired" are used interchangeably herein
to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence
identities in a manner like that found in double-helical DNA with thymine or uracil residues linked
to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three
hydrogen bonds (See Stryer, L., Biochemistry, 4th edition, 1995).

The terms "complementary" or "complement thereof" are used herein to refer to the
sequences of polynucleotides which are capable of forming Watson & Crick base pairing with another
specified polynucleotide throughout the entirety of the complementary region. For the purpose of the
present invention, a first polynucleotide is deemed to be complementary to a second polynucleotide
when each base in the first polynucleotide is paired with its complementary base. Complementary
bases are, generally, A and T (or A and U), or C and G. "Complement" is used herein as a synonym
for "complementary polynucleotide", "complementary nucleic acid" and "complementary
nucleotide sequence". These terms are applied to pairs of polynucleotides based solely upon their
sequences and not any particular set of conditions under which the two polynucleotides would
actually bind.
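
Because complementarity as defined here depends on sequence alone, it can be checked without reference to hybridization conditions. The following minimal sketch assumes antiparallel pairing and sequences written 5' to 3'; the names PARTNERS and is_complementary are hypothetical.

```python
# Watson & Crick partners as stated in the text: A with T (or U), C with G.
PARTNERS = {
    "A": {"T", "U"},
    "T": {"A"},
    "U": {"A"},
    "C": {"G"},
    "G": {"C"},
}


def is_complementary(first, second):
    """True if every base of `first` pairs with the facing base of `second`.

    Both sequences are read 5' to 3', so `second` is traversed in
    reverse to model antiparallel strands (an assumption of this sketch).
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        return False
    return all(
        facing in PARTNERS.get(base, set())
        for base, facing in zip(first, reversed(second))
    )
```

Under these assumptions, is_complementary("ATGC", "GCAT") holds, whereas changing a single base of either sequence breaks complementarity over the region.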
1- Polynucleotides

The invention also relates to variants and fragments of the polynucleotides described herein, 
particularly of an olfactory receptor gene containing one or more biallelic markers according to the 
invention.

Variants of polynucleotides, as the term is used herein, are polynucleotides that differ from a 
reference polynucleotide. A variant of a polynucleotide may be a naturally occurring variant such as 
a naturally occurring allelic variant, or it may be a variant that is not known to occur naturally. Such 
non-naturally occurring variants of the polynucleotide may be made by mutagenesis techniques, 
including those applied to polynucleotides, cells or organisms. Generally, differences are limited so
that the nucleotide sequences of the reference and the variant are closely similar overall and, in many 
regions, identical. 

Variants of polynucleotides according to the invention include, without being limited to,
nucleotide sequences at least 95% identical to a nucleic acid selected from the group consisting of
SEQ ID Nos 1-11, or to any polynucleotide fragment of at least 12 consecutive nucleotides from a
nucleic acid selected from the group consisting of SEQ ID Nos 1-11, and preferably at least 99%
identical, more particularly at least 99.5% identical, and most preferably at least 99.8% identical to a
nucleic acid selected from the group consisting of SEQ ID Nos 1-11, or to any polynucleotide
fragment of at least 12 consecutive nucleotides from a nucleic acid selected from the group
consisting of SEQ ID Nos 1-11.
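
This passage does not state how percent identity is calculated. As a minimal sketch only, assuming two sequences that have already been aligned without gaps over the same length, identity can be taken as the share of matching positions. The function name percent_identity and the ungapped comparison are assumptions made for illustration, not the method of the specification.

```python
def percent_identity(reference, candidate):
    """Percentage of identical positions between two equal-length sequences.

    Assumes the sequences are already aligned without gaps; this is a
    simplification and not necessarily the calculation intended here.
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        a == b for a, b in zip(reference.upper(), candidate.upper())
    )
    return 100.0 * matches / len(reference)
```

Under this simplified measure, a 20-nucleotide candidate differing from the reference at one position is 95% identical.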

Changes in the nucleotide of a variant may be silent, which means that they do not alter the 
amino acids encoded by the polynucleotide. However, nucleotide changes may also result in amino 
acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the 
reference sequence. The substitutions, deletions or additions may involve one or more nucleotides.
The variants may be altered in coding or non-coding regions or both. Alterations in the coding
regions may produce conservative or non-conservative amino acid substitutions, deletions or 
additions. 

In the context of the present invention, particularly preferred embodiments are those in
which the polynucleotides encode polypeptides which retain substantially the same biological
function or activity as the mature olfactory receptor protein, or those in which the polynucleotides
encode polypeptides which maintain or increase a particular biological activity, while reducing a
second biological activity.

A polynucleotide fragment is a polynucleotide whose sequence is fully comprised within
part of a given nucleotide sequence, preferably the nucleotide sequence of an olfactory receptor gene
of the invention, and variants thereof. The fragment can be a portion of a coding or non-coding
region of the olfactory receptor gene cluster. Preferably, such fragments comprise at least one of the
biallelic markers A1 to A13 or the complements thereto or a biallelic marker in linkage
disequilibrium with one or more of the biallelic markers A1 to A13, for which the respective
locations in the sequence listing are provided in Table 2.

Such fragments may be "free-standing", i.e. not part of or fused to other polynucleotides, or 
they may be comprised within a single larger polynucleotide of which they form a part or region. 
However, several fragments may be comprised within a single larger polynucleotide.

As representative examples of polynucleotide fragments of the invention, there may be
mentioned those which are about 4, 6, 8, 15, 20, 25, 40, 10 to 30, 30 to 55, 50 to 100, 75 to
100 or 100 to 200 nucleotides in length. Preferred are those fragments of about 47 nucleotides
in length, such as those comprising at least one of the biallelic markers A1 to A13 of the olfactory
receptor gene. Optionally, such fragments may consist of, or consist essentially of a contiguous span
of at least 8, 10, 12, 15, 18, 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length.
A set of preferred fragments contain at least one of the biallelic markers A1 to A13 of the olfactory
receptor gene which are described herein or the complements thereto.

2- Polypeptides 

The invention also relates to variants, fragments, analogs and derivatives of the polypeptides
described herein, including mutated olfactory receptor proteins.

The variant may be 1) one in which one or more of the amino acid residues are substituted
with a conserved or non-conserved amino acid residue and such substituted amino acid residue may
or may not be one encoded by the genetic code, or 2) one in which one or more of the amino acid
residues includes a substituent group, or 3) one in which the mutated olfactory receptor is fused with
another compound, such as a compound to increase the half-life of the polypeptide (for example,
polyethylene glycol), or 4) one in which the additional amino acids are fused to the mutated
olfactory receptor, such as a leader or secretory sequence or a sequence which is employed for
purification of the mutated olfactory receptor or a preprotein sequence. Such variants are deemed to
be within the scope of those skilled in the art.

In the case of an amino acid substitution in the amino acid sequence of a polypeptide
according to the invention, one or several amino acids can be replaced by "equivalent" amino acids.
The expression "equivalent" amino acid is used herein to designate any amino acid that may be
substituted for one of the amino acids having similar properties, such that one skilled in the art of
peptide chemistry would expect the secondary structure and hydropathic nature of the polypeptide to
be substantially unchanged. Generally, the following groups of amino acids represent equivalent
changes: (1) Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr; (2) Cys, Ser, Tyr, Thr; (3) Val, Ile, Leu,
Met, Ala, Phe; (4) Lys, Arg, His; (5) Phe, Tyr, Trp, His.
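
As a non-limiting sketch, the five groups of equivalent amino acids listed above can be encoded as sets in order to test whether a given substitution is an "equivalent" change. The three-letter codes are read here as Gln and Ile; the function name is hypothetical and the test only reflects membership in the groups above, not any measurement of structure or activity.

```python
# The five groups of "equivalent" amino acids, as listed in the text.
EQUIVALENT_GROUPS = [
    {"Ala", "Pro", "Gly", "Glu", "Asp", "Gln", "Asn", "Ser", "Thr"},
    {"Cys", "Ser", "Tyr", "Thr"},
    {"Val", "Ile", "Leu", "Met", "Ala", "Phe"},
    {"Lys", "Arg", "His"},
    {"Phe", "Tyr", "Trp", "His"},
]


def is_equivalent_change(original: str, replacement: str) -> bool:
    """True if both residues belong to at least one common group."""
    return any(
        original in group and replacement in group
        for group in EQUIVALENT_GROUPS
    )


print(is_equivalent_change("Val", "Ile"))
print(is_equivalent_change("Lys", "Asp"))
```

Because some residues (for example Ser, Thr, Ala, Phe, Tyr and His) appear in more than one group, the relation is not transitive: two residues are treated as equivalent only when they share a group directly.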

More particularly, a variant olfactory receptor polypeptide comprises amino acid changes
ranging from 1, 2, 3, 4, 5, 10 to 20 substitutions, additions or deletions of one amino acid, preferably
from 1 to 10, more preferably from 1 to 5 and most preferably from 1 to 3 substitutions, additions or
deletions of one amino acid. The preferred amino acid changes are those which have little or no
influence on the biological activity or the capacity of the variant olfactory receptor polypeptide to
bind to antibodies raised against a native olfactory receptor protein.

A specific, but not restrictive, embodiment of a modified peptide molecule of interest
according to the present invention, which consists of a peptide molecule which is resistant to
proteolysis, is a peptide in which the -CONH- peptide bond is modified and replaced by a (CH2NH)
reduced bond, a (NHCO) retro-inverso bond, a (CH2-O) methylene-oxy bond, a (CH2-S)
thiomethylene bond, a (CH2CH2) carba bond, a (CO-CH2) ketomethylene bond, a (CHOH-CH2)
hydroxyethylene bond, a (N-N) bond, an E-alkene bond or also a -CH=CH- bond.

The polypeptide according to the invention could have post-translational modifications. For
example, it can present the following modifications: acylation, disulfide bond formation,
prenylation, carboxymethylation and phosphorylation.

A polypeptide fragment is a polypeptide whose sequence is fully comprised within part of a
given polypeptide sequence, preferably a polypeptide encoded by an olfactory receptor gene and
variants thereof.

Such fragments may be "free-standing", i.e. not part of or fused to other polypeptides, or
they may be comprised within a single larger polypeptide of which they form a part or region.
However, several fragments may be comprised within a single larger polypeptide.

As representative examples of polypeptide fragments of the invention, there may be
mentioned those which are about 5, 6, 7, 8, 9 or 10 to 15, 10 to 20, 15 to 40, or 30 to 55
amino acids long. Preferred polypeptide fragments according to the invention comprise a contiguous
span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15,
20, 25, 30, 40, 50, or 100 amino acids of one amino acid sequence. Preferred are those fragments
containing at least one amino acid mutation in the olfactory receptor protein under consideration.

Identity between nucleic acids or polypeptides 

The terms "percentage of sequence identity" and "percentage homology" are used
interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are
determined by comparing two optimally aligned sequences over a comparison window, wherein the
portion of the polynucleotide or polypeptide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by
determining the number of positions at which the identical nucleic acid base or amino acid residue
occurs in both sequences to yield the number of matched positions, dividing the number of matched
positions by the total number of positions in the window of comparison and multiplying the result by
100 to yield the percentage of sequence identity. Homology is evaluated using any of the
variety of sequence comparison algorithms and programs known in the art, or by eye inspection.
Such algorithms and programs include, but are by no means limited to, TBLASTN, BLASTP,
FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, 1988; Altschul et al., 1990; Thompson
et al., 1994; Higgins et al., 1996; Altschul et al., 1993). In a particularly
preferred embodiment, protein and nucleic acid sequence homologies are evaluated using the Basic
Local Alignment Search Tool ("BLAST") which is well known in the art (see, e.g., Karlin and
Altschul, 1990; Altschul et al., 1990, 1993, 1997). In particular, five specific BLAST programs are
used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein
sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence
database;

(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide
sequence (both strands) against a protein sequence database;

(4) TBLASTN compares a query protein sequence against a nucleotide sequence database
translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against
the six-frame translations of a nucleotide sequence database.
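
The "six-frame conceptual translation" referred to in items (3) to (5) above can be illustrated by the following minimal sketch. It is not the BLAST implementation; it merely assumes the standard genetic code, uses hypothetical function names, and marks any codon containing a character other than A, C, G or T as "X".

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; '*' denotes a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Return the three forward (+1..+3) and three reverse (-1..-3) frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
```

For the toy query "ATGGCCTAA", frame +1 reads ATG GCC TAA and frame -1 reads the reverse complement TTA GGC CAT.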

The BLAST programs identify homologous sequences by identifying similar segments,
which are referred to herein as "high-scoring segment pairs," between a query amino or nucleic acid
sequence and a test sequence which is preferably obtained from a protein or nucleic acid sequence
database. High-scoring segment pairs are preferably identified (i.e., aligned) by means of a scoring
matrix, many of which are known in the art. Preferably, the scoring matrix used is the BLOSUM62
matrix (Gonnet et al., 1992; Henikoff and Henikoff, 1993). Less preferably, the PAM or PAM250
matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978). The BLAST programs
evaluate the statistical significance of all high-scoring segment pairs identified, and preferably
select those segments which satisfy a user-specified threshold of significance, such as a
user-specified percent homology. Preferably, the statistical significance of a high-scoring segment pair is
evaluated using the statistical significance formula of Karlin (see, e.g., Karlin and Altschul, 1990).
The BLAST programs may be used with the default parameters or with modified parameters
provided by the user.
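
The calculation of the percentage of sequence identity defined at the beginning of this section can be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned by one of the programs mentioned above and that gaps are written as "-"; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identity over a comparison window.

    Both strings are the aligned sequences over the same window, with
    '-' marking a gap. Matched positions are divided by the total number
    of positions in the window and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 10 positions, one gap and one mismatch, 8 matches.
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
```

Gapped positions count toward the total number of positions in the window but never toward the matched positions, in keeping with the definition given above.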

Stringent Hybridization Conditions 

By way of example and not limitation, procedures using conditions of high stringency are as
follows: Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in
buffer composed of 6 X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll,
0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C,
the preferred hybridization temperature, in prehybridization mixture containing 100 µg/ml denatured
salmon sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe. Alternatively, the hybridization step
can be performed at 65°C in the presence of SSC buffer, 1 X SSC corresponding to 0.15 M NaCl and
0.05 M Na citrate. Subsequently, filter washes can be done at 37°C for 1 h in a solution containing 2
X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA, followed by a wash in 0.1 X SSC at 50°C for 45
min. Alternatively, filter washes can be performed in a solution containing 2 X SSC and 0.1% SDS,
or 0.5 X SSC and 0.1% SDS, or 0.1 X SSC and 0.1% SDS at 68°C for 15 minute intervals.
Following the wash steps, the hybridized probes are detectable by autoradiography. Other
conditions of high stringency which may be used are well known in the art and as cited in Sambrook
et al., 1989; and Ausubel et al., 1989. These hybridization conditions are suitable for a nucleic acid
molecule of about 20 nucleotides in length. It goes without saying that the hybridization conditions
described above are to be adapted according to the length of the desired nucleic acid, following
techniques well known to the one skilled in the art. The suitable hybridization conditions may for
example be adapted according to the teachings disclosed in the book of Hames and Higgins (1985)
or in Sambrook et al. (1989).
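
Purely as a worked illustration of the buffer strengths used above, the NaCl concentration at each step can be derived from the stated equivalence of 1 X SSC to 0.15 M NaCl. The step names below are labels chosen for this sketch, and only the first wash procedure is represented.

```python
# From the text: 1 X SSC corresponds to 0.15 M NaCl.
NACL_MOLAR_AT_1X_SSC = 0.15

# (label, SSC strength, temperature in degrees C), taken from the
# prehybridization step and the first wash procedure described above.
STEPS = [
    ("prehybridization", 6.0, 65),
    ("first wash", 2.0, 37),
    ("second wash", 0.1, 50),
]


def nacl_molarity(ssc_strength: float) -> float:
    """NaCl molarity of an SSC buffer of the given strength."""
    return ssc_strength * NACL_MOLAR_AT_1X_SSC


for label, strength, temperature in STEPS:
    print(f"{label}: {strength} X SSC at {temperature} C, "
          f"{nacl_molarity(strength):.3f} M NaCl")
```

The decreasing salt concentration from prehybridization to the final wash is what makes the final wash the most stringent step of the procedure.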

HOMOLOGIES OF THE NOVEL OLFACTORY RECEPTOR GENE WITH 
KNOWN OLFACTORY RECEPTORS 

A comparison analysis of various olfactory receptor amino acid sequences, including the
novel sequences of the invention, has been performed with the alignment program Pileup and the
translation program MAP (Wisconsin Package version 8, GCG). The protein sequences were sorted
into different families and subfamilies, taking into account their Amino acid Sequence Identity
(ASI). It was observed that the Open Reading Frames of the OLF1 to OLF10 genes are genetically
clearly distinguished from the already known olfactory receptor sequences. For example, the
olfactory receptor OLF2 presents respectively 39.9 %, 43.1 % and 44.2 % of identity with prior art
olfactory receptors referred to in GenBank as L35475, U58675_1 and Y10530. In addition, the
nucleotide sequences of Orf-2 to Orf-10 according to the invention are all grouped together, whereas
the nucleotide Orf-1 of the invention forms a new family by itself. These amino acid sequence
comparison data clearly indicate that the novel olfactory receptor sequences of the invention share
common genetic characteristics (Orf-2 to Orf-10) or have specific characteristics (Orf-1) that are not
found in the prior art olfactory receptor sequences.

A. OLF1 TO OLF10 GENE POLYNUCLEOTIDES.

The cluster of ten olfactory receptor genes has been found by the inventors to be located on
the human chromosome 11, more precisely within the 11q12-q13 locus of said chromosome as
described in Example 1.

1. Genomic sequences of the olfactory receptor gene 

The present invention concerns the genomic sequence of an olfactory receptor cluster. The
present invention encompasses the olfactory receptor gene, or olfactory receptor genomic sequences
consisting of, consisting essentially of, or comprising the sequence of SEQ ID No 1, a sequence
complementary thereto, as well as fragments and variants thereof. These polynucleotides may be
purified, isolated, or recombinant.

The invention also encompasses a purified, isolated, or recombinant polynucleotide
comprising a nucleotide sequence having at least 70, 75, 80, 85, 90, or 95% nucleotide identity with
a nucleotide sequence of SEQ ID No 1 or a complementary sequence thereto or a fragment thereof.
The nucleotide differences with respect to the nucleotide sequence of SEQ ID No 1 may be generally
randomly distributed throughout the entire nucleic acid. Nevertheless, preferred nucleic acids are
those wherein the nucleotide differences with respect to the nucleotide sequence of SEQ ID No 1 are
predominantly located outside the coding sequences contained in the exons. These nucleic acids, as
well as their fragments and variants, may be used as oligonucleotide primers or probes in order to
detect the presence of a copy of the olfactory receptor gene in a test sample, or alternatively in order
to amplify a target nucleotide sequence within the olfactory receptor sequences.

Another object of the invention consists of a purified, isolated, or recombinant nucleic acid
that hybridizes with the nucleotide sequence of SEQ ID No 1 or a complementary sequence thereto,
under stringent hybridization conditions as defined above.

Particularly preferred nucleic acids of the invention include isolated, purified, or
recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40,
50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1 or the complements
thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the following nucleotide
positions of SEQ ID No 1: 1-113643, 114064-127488, 127855-144460. Additional preferred nucleic
acids of the invention include isolated, purified, or recombinant polynucleotides comprising a
contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or
1000 nucleotides of SEQ ID No 1 or the complements thereof, wherein said contiguous span
comprises at least 1, 2, 3, 5, or 10 of the following nucleotide positions of SEQ ID No 1: 1-10000,
10001-20000, 20001-30000, 30001-40000, 40001-50000, 50001-60000, 60001-70000,
70001-80000, 80001-90000, 90001-100000, 100001-110000, 110001-120000, 120001-130000,
130001-140000, and 140001-144460. Further preferred nucleic acids of the invention include isolated,
purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1 or the
complements thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the
following nucleotide positions of SEQ ID No 1: 1-5000, 5001-10000, 10001-15000, 15001-20000,
20001-25000, 25001-30000, 30001-35000, 35001-40000, 40001-45000, 45001-50000,
50001-55000, 55001-60000, 60001-65000, 65001-70000, 70001-75000, 75001-80000, 80001-85000,
85001-90000, 90001-95000, 95001-100000, 100001-105000, 105001-110000, 110001-115000,
115001-120000, 120001-125000, 125001-130000, 130001-135000, 135001-140000, and
140001-144460.

The olfactory receptor genomic nucleic acid comprises 10 open reading frames, each carried
by a single exon and encoding a polypeptide designated OLF1 to OLF10. The open reading frame
positions of OLF1 to OLF10 in SEQ ID No 1 are given as features in the sequence listing and are
also detailed below in Table A.

Two truncated ubiquitin polypeptides Ubi1 and Ubi2, unrelated to olfactory receptor coding
sequences, are encoded on the complementary strand of the olfactory receptor gene. The
complementary sequence of the Ubi1 ORF is located between the nucleotide in position 114063 and
the nucleotide in position 113644 of the nucleotide sequence of SEQ ID No 1. The complementary
sequence of the Ubi2 ORF is located between the nucleotide in position 127854 and the nucleotide
in position 127489 of the nucleotide sequence of SEQ ID No 1.
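
The position ranges recited above for SEQ ID No 1 can be regenerated and checked with a short sketch. It assumes only the total length of 144460 nucleotides and the Ubi1 and Ubi2 positions given above; the function names are hypothetical. The first set of ranges (1-113643, 114064-127488, 127855-144460) is read here as SEQ ID No 1 minus the two Ubi ORFs.

```python
SEQ_LENGTH = 144460  # length of SEQ ID No 1, from the last range above

# Ranges of SEQ ID No 1 lying outside the Ubi1 and Ubi2 ORFs.
REGIONS_OUTSIDE_UBI = [(1, 113643), (114064, 127488), (127855, 144460)]


def windows(total_length: int, size: int) -> list:
    """Consecutive 1-based inclusive windows; the last one may be shorter."""
    return [
        (start, min(start + size - 1, total_length))
        for start in range(1, total_length + 1, size)
    ]


def overlap_count(span: tuple, region: tuple) -> int:
    """Number of positions shared by two 1-based inclusive ranges."""
    low = max(span[0], region[0])
    high = min(span[1], region[1])
    return max(0, high - low + 1)


windows_10kb = windows(SEQ_LENGTH, 10000)
windows_5kb = windows(SEQ_LENGTH, 5000)

# A contiguous span of 11 nucleotides straddling the start of the Ubi1 ORF.
span = (113640, 113650)
positions_outside_ubi = sum(
    overlap_count(span, region) for region in REGIONS_OUTSIDE_UBI
)
```

In this toy case the span 113640-113650 comprises four positions (113640 to 113643) of the recited ranges, so it would satisfy "at least 1, 2 or 3" but not "at least 5" of those positions.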

Table A

Coding regions                          Non-coding regions
Name     Position in SEQ ID No 1        Name     Position in SEQ ID No 1
         Beginning    End                        Beginning    End
OLF1     2406         2600              NC1      1            2405
OLF2     9711         10658             NC2      2601         9710
OLF3     24851        25369             NC3      10659        24850
OLF4     45714        46661             NC4      25370        45713
OLF5     80198        81115             NC5      46662        80197
OLF6     96291        96902             NC6      81116        96290
OLF7     110758       111564            NC7      96903        110757
OLF8     122525       122887            NC8      111565       122524
OLF9     132454       133389            NC9      122888       132453
OLF10    143398       143577            NC10     133390       143397
                                        NC11     143578       144460


Thus, the invention embodies purified, isolated, or recombinant polynucleotides comprising
a nucleotide sequence selected from the group consisting of the 10 open reading frames of the
olfactory receptor gene, or a sequence complementary thereto.

The nucleic acid of SEQ ID No 1 also comprises non-coding portions flanking each of the
ten olfactory receptor open reading frames of the sense DNA strand.

The invention also embodies purified, isolated, or recombinant polynucleotides comprising a
nucleotide sequence selected from the group consisting of the non-coding regions contained in the
olfactory receptor gene cluster of SEQ ID No 1, or a sequence complementary thereto as well as
their fragments or variants. The term "non-coding" sequence refers to any nucleotide sequence
which does not encode an amino acid. The non-coding sequences encompass upstream and
downstream regions of the olfactory receptor ORFs of the invention, as well as regions located
between two successive olfactory receptor ORFs, as indicated in Table A which lists the 11
non-coding regions named from NC1 to NC11.
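
The relationship between the coding and non-coding columns of Table A can be verified with the following sketch, which derives the 11 non-coding regions from the ORF positions and the length of SEQ ID No 1. The ORF coordinates are those of Table A; the function name is hypothetical, and the sketch considers the sense-strand ORFs only, as Table A does.

```python
SEQ_LENGTH = 144460  # end position of NC11 in Table A

# ORF positions in SEQ ID No 1, copied from Table A (1-based, inclusive).
ORFS = {
    "OLF1": (2406, 2600),
    "OLF2": (9711, 10658),
    "OLF3": (24851, 25369),
    "OLF4": (45714, 46661),
    "OLF5": (80198, 81115),
    "OLF6": (96291, 96902),
    "OLF7": (110758, 111564),
    "OLF8": (122525, 122887),
    "OLF9": (132454, 133389),
    "OLF10": (143398, 143577),
}


def non_coding_regions(orfs: dict, total_length: int) -> list:
    """Return (name, beginning, end) for every region between the ORFs."""
    regions = []
    cursor = 1
    for index, (begin, end) in enumerate(sorted(orfs.values()), start=1):
        regions.append((f"NC{index}", cursor, begin - 1))
        cursor = end + 1
    regions.append((f"NC{len(orfs) + 1}", cursor, total_length))
    return regions


non_coding = non_coding_regions(ORFS, SEQ_LENGTH)
orf_lengths = {name: end - begin + 1 for name, (begin, end) in ORFS.items()}
```

Each ORF length computed this way is a multiple of three, as expected for an open reading frame carried by a single exon.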

The nucleic acids defining the non-coding sequences of the polynucleotide of SEQ ID No 1
described above, as well as their fragments and variants, may be used as oligonucleotide primers or
probes in order to detect the presence of a copy of one of the olfactory receptor genes of the
invention in a test sample, or alternatively in order to amplify a target nucleotide sequence within the
cluster of olfactory receptor encoding sequences according to the invention.

While this section is entitled "Genomic Sequences of the olfactory receptor gene," it should
be noted that nucleic acid fragments of any size and sequence may also be comprised by the
polynucleotides described in this section, flanking the genomic sequences of olfactory receptor on
either side or between two or more such genomic sequences.

2. Coding regions of the olfactory receptor gene 

The 10 olfactory receptor open reading frames are presented individually as SEQ ID Nos 2-11
in the appended sequence listing.

Thus, another object of the invention is a purified, isolated, or recombinant nucleic acid
comprising a nucleotide sequence selected from the group consisting of SEQ ID Nos 2-11,
complementary sequences thereto, as well as allelic variants, and fragments thereof. Moreover,
preferred polynucleotides of the invention include purified, isolated, or recombinant olfactory
receptor cDNAs consisting of, consisting essentially of, or comprising a sequence selected from the
group consisting of SEQ ID Nos 2-11.

The invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide
having at least 95% nucleotide identity with a polynucleotide selected from the group consisting of
SEQ ID Nos 2-11, advantageously 99% nucleotide identity, preferably 99.5% nucleotide identity
and most preferably 99.8% nucleotide identity with a polynucleotide selected from the group
consisting of SEQ ID Nos 2-11, or a sequence complementary thereto or a biologically active
fragment thereof.

Another object of the invention relates to purified, isolated or recombinant nucleic acids
comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined
herein, with a polynucleotide selected from the group consisting of SEQ ID Nos 2-11, or a sequence
complementary thereto or a biologically active fragment thereof.

Particularly preferred nucleic acids of the invention include isolated, purified, or
recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40,
50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of a sequence selected from the group
consisting of SEQ ID Nos 2-11 or the complements thereof. Additional preferred embodiments of
the invention include isolated, purified, or recombinant polynucleotides comprising a contiguous
span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000
nucleotides of a sequence selected from the group consisting of SEQ ID Nos 2-11 or the
complements thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the
following nucleotide positions of said selected sequence: 1-50, 51-100, 101-150, 151-200, 201-250,
251-300, 301-350, 351-400, 401-450, 451-500, 501-550, 551-600, 601-650, 651-700, 701-750,
751-800, 801-850, 851-900, 901-the terminal nucleotide of the olfactory receptor coding regions, to the
extent that such nucleotide positions are consistent with the lengths of the particular olfactory
receptor coding region being referred to. Further preferred embodiments of the invention include
isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15,
18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of a sequence
selected from the group consisting of SEQ ID Nos 2, 4, 7, 9 and 11, or the complements thereof,
wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the following nucleotide
positions of said selected sequence: 1-25, 26-50, 51-75, 76-100, 101-125, 126-150, 151-175,
176-200, 201-225, 226-250, 251-275, 276-300, 301-325, 326-350, 351-375, 376-400, 401-425, 426-450,
451-475, 476-500, 501-525, 526-550, 551-575, 576-the terminal nucleotide of the olfactory receptor
coding regions, to the extent that such nucleotide positions are consistent with the lengths of the
particular olfactory receptor coding region being referred to.

The present invention also embodies isolated, purified, and recombinant polynucleotides
encoding olfactory receptor polypeptides, wherein olfactory receptor polypeptides comprise an
amino acid sequence selected from the group consisting of SEQ ID Nos 12-21, a nucleotide
sequence complementary thereto, a fragment or a variant thereof. The present invention also
embodies isolated, purified, and recombinant polynucleotides which encode polypeptides
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a sequence selected from the
group consisting of SEQ ID Nos 12-21. In a preferred embodiment, the present invention embodies
isolated, purified, and recombinant polynucleotides which encode polypeptides comprising a
contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at
least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a sequence selected from the group consisting
of SEQ ID Nos 12-21 wherein said contiguous span includes at least 1, 2, 3, 5 or 10 of the following
amino acid positions in said selected sequence: 1-20, 21-40, 41-60, 61-80, 81-100, 101-120,
121-140, 141-160, 161-180, 181-200, 201-220, 221-240, 241-260, 261-280, 281-300, 301-the terminal
amino acid of the olfactory receptor proteins, to the extent that such amino acid positions are
consistent with the lengths of the particular olfactory receptor protein being referred to. In another
preferred embodiment, the present invention embodies isolated, purified, and recombinant
polynucleotides which encode polypeptides comprising a contiguous span of at least 6 amino acids,
preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100
amino acids of a sequence selected from the group consisting of SEQ ID Nos 12, 14, 17, 19 or 21
wherein said contiguous span includes at least 1, 2, 3, 5 or 7 of the following amino acid positions in
said selected sequence: 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100,
101-110, 111-120, 121-130, 131-140, 141-150, 151-160, 161-170, 171-180, 181-190, 191-the terminal
amino acid of the olfactory receptor proteins, to the extent that such amino acid positions are
consistent with the lengths of the particular olfactory receptor protein being referred to.

In further preferred embodiments, the present invention embodies isolated, purified, and
recombinant polynucleotides which encode olfactory receptor polypeptides comprising a contiguous
span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12,
15, 20, 25, 30, 40, 50, or 100 amino acids of a sequence selected from the group consisting of SEQ
ID Nos 12-21, wherein said contiguous span includes at least one amino acid at the following
positions of said selected sequence:

i) 1-3, 10, 16, 21, 28, 33, 34, 36, 42-44, 46, 49, 53, 54, 57, 59, 63, and 64 for SEQ ID
No 12;

ii) 2, 4, 6, 8, 18, 25, 34, 37, 44, 52, 56, 80, 83, 89, 98, 101, 102, 113, 114, 117, 120,
139, 148, 158, 186, 195, 212, 219, 247, 266, 270, 280, 295, 298, 299, 301, 311, and
313-315 for SEQ ID No 13;

iii) 2-4, 6, 18, 21, 25, 34, 37, 98, 99, 102, 113, 114, 133, 143, 148, 158-163, 166, 167,
169, and 170 for SEQ ID No 14;

iv) 2, 4, 6, 8, 18, 25, 34, 37, 44, 52, 54, 56, 80, 83, 89, 98, 101, 102, 113, 114, 117, 120,
139, 148, 158, 186, 195, 212, 219, 247, 266, 270, 280, 298, 299, 311, and 313-315
for SEQ ID No 15;

v) 3, 18, 20, 25, 34, 47, 49, 67, 97, 100, 107, 108, 112, 113, 126, 135, 142, 146, 147,
157, 159-160, 194, 196, 228, 245, 264, 265, 269, 279, 298, and 302 for SEQ ID No
16;

vi) 2, 6, 18, 20, 33, 34, 37, 65, 68, 69, 72, 86, 88, 101, 107, 113, 114, 148, 158, 161,
164, 195, and 198 for SEQ ID No 17;

vii) 2, 6, 7, 52, 56, 67, 88, 94, 97, 110, 113, 116, 119, 120, 127, 135, 150, 153, 164, 174,
175, 180, 184, 217, 221, 259, 261, and 268 for SEQ ID No 18;

viii) 17, 18, 20, 28, 33, 35, 49-52, 105, 111, and 112 for SEQ ID No 19;

ix) 17, 20, 33, 35, 49-53, 56, 111, 112, 132, 138, 141, 147, 154, 157, 160, 163, 164,
194, 197, 204, 211, 214, 218, 219, 252, 265, 286, 295, 301, 303, 305, 306 and 309
for SEQ ID No 20; and

x) 9, 18, 26-28, 34, 47 and 50 for SEQ ID No 21, to the extent that such amino acid
lengths are consistent with the lengths of the particular olfactory receptor protein
being referred to.

Additional preferred fragments of the nucleotide sequences of SEQ ID Nos 2-11 are those
encoding olfactory receptor polypeptide fragments located outside the transmembrane domains of
the corresponding protein as located in boxes in Figure 1.




The above disclosed polynucleotides that contain only coding sequences derived from the
olfactory receptor ORFs may be expressed in a desired host cell or a desired host organism, when
said polynucleotides are placed under the control of suitable expression signals. Such a
polynucleotide, when placed under suitable expression signals, may be inserted in a vector for its
expression.

While this section is entitled "Coding regions of the olfactory receptor gene," it should be
noted that nucleic acid fragments of any size and sequence may also be comprised by the
polynucleotides described in this section, flanking the genomic sequences of olfactory receptor on
either side or between two or more such genomic sequences.

3. Polynucleotide Constructs

The terms "polynucleotide construct" and "recombinant polynucleotide" are used 
interchangeably herein to refer to linear or circular, purified or isolated polynucleotides that have 
been artificially designed and which comprise at least two nucleotide sequences that are not found as 
contiguous nucleotide sequences in their initial natural environment. 

DNA Construct That Enables Directing Temporal And Spatial olfactory receptor Gene Expression
In Recombinant Cell Hosts And In Transgenic Animals.

In order to study the physiological and phenotypic consequences of a lack of synthesis of the
olfactory receptor protein, both at the cell level and at the multicellular organism level, the
invention also encompasses DNA constructs and recombinant vectors enabling a conditional
expression of a specific allele of the olfactory receptor genomic sequence or cDNA and also of a
copy of this genomic sequence or cDNA harboring substitutions, deletions, or additions of one or
more bases with respect to the olfactory receptor nucleotide sequence of SEQ ID Nos 1-11, or a
fragment thereof, these base substitutions, deletions or additions being located in the coding regions
of the olfactory receptor genomic sequence or within the olfactory receptor open reading frames of
SEQ ID Nos 2-11. In a preferred embodiment, the olfactory receptor sequence
comprises a biallelic marker of the present invention, preferably one of the biallelic markers A1 to
A13.

The present invention embodies recombinant vectors comprising any one of the
polynucleotides described in the present invention. More particularly, the polynucleotide constructs
according to the present invention can comprise any of the polynucleotides described in the
"Genomic sequences of the olfactory receptor gene" section, the "Coding regions of the olfactory
receptor Gene" section, and the "Oligonucleotide probes and primers" section.

DNA Constructs Allowing Homologous Recombination: Replacement Vectors

A first preferred DNA construct will comprise, from 5'-end to 3'-end: (a) a first nucleotide
sequence that is comprised in the olfactory receptor genomic sequence; (b) a nucleotide sequence
comprising a positive selection marker, such as the marker for neomycin resistance (neo); and (c) a
second nucleotide sequence that is comprised in the olfactory receptor genomic sequence, and is
located on the genome downstream of the first olfactory receptor nucleotide sequence (a).

In a preferred embodiment, this DNA construct also comprises a negative selection marker
located upstream of the nucleotide sequence (a) or downstream of the nucleotide sequence (c).
Preferably, the negative selection marker comprises the thymidine kinase (tk) gene (Thomas et al.,
1986), the hygromycin beta gene (Te Riele et al., 1990), the hprt gene (Van der Lugt et al., 1991;
Reid et al., 1990) or the Diphtheria toxin A fragment (Dt-A) gene (Nada et al., 1993; Yagi et al., 1990).
Preferably, the positive selection marker is located within an olfactory receptor open reading frame
sequence so as to interrupt the sequence encoding an olfactory receptor protein. These replacement
vectors are described, for example, by Thomas et al. (1986; 1987), Mansour et al. (1988) and Koller
et al. (1992).

The first and second nucleotide sequences (a) and (c) may be indifferently located within an
olfactory receptor regulatory sequence, an intronic sequence, an exon sequence or a sequence
containing both regulatory and/or intronic and/or exon sequences. The size of the nucleotide
sequences (a) and (c) ranges from 1 to 50 kb, preferably from 1 to 10 kb, more preferably from 2 to
6 kb and most preferably from 2 to 4 kb.
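
The (a)-(b)-(c) arrangement of a replacement vector can be sketched in silico as follows. This is an illustrative assembly of strings only, not a cloning protocol: the genomic sequence, the insertion point and the marker cassette are placeholders invented for the example, and the only constraint taken from the text is the 1 to 50 kb size range of sequences (a) and (c).

```python
MIN_ARM = 1_000   # 1 kb, lower bound stated for sequences (a) and (c)
MAX_ARM = 50_000  # 50 kb, upper bound stated for sequences (a) and (c)


def build_replacement_construct(genomic: str, insertion_index: int,
                                arm_a_len: int, arm_c_len: int,
                                marker: str) -> str:
    """Assemble, 5' to 3': sequence (a), positive marker (b), sequence (c).

    `insertion_index` is the 0-based index in `genomic` at which the
    marker interrupts the sequence; (a) ends there and (c) starts there.
    """
    for arm_len in (arm_a_len, arm_c_len):
        if not MIN_ARM <= arm_len <= MAX_ARM:
            raise ValueError("arm length outside the 1 to 50 kb range")
    a_start = insertion_index - arm_a_len
    c_end = insertion_index + arm_c_len
    if a_start < 0 or c_end > len(genomic):
        raise ValueError("arms extend beyond the genomic sequence")
    arm_a = genomic[a_start:insertion_index]
    arm_c = genomic[insertion_index:c_end]
    return arm_a + marker + arm_c


# Placeholder data (hypothetical): an 8 kb toy genome and a dummy cassette.
toy_genome = "ACGT" * 2000
dummy_marker = "N" * 4
construct = build_replacement_construct(
    toy_genome, insertion_index=4000,
    arm_a_len=2000, arm_c_len=2000, marker=dummy_marker,
)
```

Choosing an insertion point inside an open reading frame corresponds to the preferred case described above, in which the positive selection marker interrupts the sequence encoding an olfactory receptor protein.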

DNA Constructs Allowing Homologous Recombination: Cre-LoxP System. 

These new DNA constructs make use of the site specific recombination system of the P1
phage. The P1 phage possesses a recombinase called Cre which interacts specifically with a 34 base
pairs loxP site. The loxP site is composed of two palindromic sequences of 13 bp separated by an 8
bp conserved sequence (Hoess et al., 1986). The recombination by the Cre enzyme between two
loxP sites having an identical orientation leads to the deletion of the DNA fragment.
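
The structure of the loxP site and the outcome of Cre-mediated recombination between two sites of identical orientation can be illustrated by the following sketch. The loxP sequence used is the commonly cited wild-type sequence, which is an assumption of this example and is not given in the present text; the two 13 bp sequences are modeled as reverse complements of each other, and the simulation is a plain string operation, not a model of the enzyme.

```python
def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Assumed wild-type loxP (not taken from the text): 13 bp + 8 bp + 13 bp.
LEFT_ARM = "ATAACTTCGTATA"
SPACER = "GCATACAT"
LOXP = LEFT_ARM + SPACER + reverse_complement(LEFT_ARM)


def cre_excise(dna: str, site: str = LOXP) -> str:
    """Delete the DNA between the first two same-orientation loxP sites.

    One copy of the site is left behind. The input is returned unchanged
    if fewer than two sites are found.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    return dna[:first + len(site)] + dna[second + len(site):]


# Toy construct (hypothetical): a "marker" TTTTTT flanked by two loxP sites.
floxed = "GGGG" + LOXP + "TTTTTT" + LOXP + "CCCC"
recombined = cre_excise(floxed)
```

After excision a single loxP site remains between the flanking sequences, which is why a selectable marker flanked by loxP sites can be eliminated while the surrounding sequences are retained.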

The Cre-loxP system used in combination with a homologous recombination technique was
first described by Gu et al.(1993, 1994). Briefly, a nucleotide sequence of interest to be
inserted in a targeted location of the genome harbors at least two loxP sites in the same orientation
and located at the respective ends of a nucleotide sequence to be excised from the recombinant
genome. The excision event requires the presence of the recombinase (Cre) enzyme within the
nucleus of the recombinant cell host. The recombinase enzyme may be provided at the desired time
by (a) incubating the recombinant cell hosts in a culture medium containing this enzyme,
injecting the Cre enzyme directly into the desired cell, such as described by Araki et al.(1995), or
lipofecting the enzyme into the cells, such as described by Baubonis et al.(1993); (b) transfecting
the cell host with a vector comprising the Cre coding sequence operably linked to a promoter
functional in the recombinant cell host, which promoter is optionally inducible, said vector being
introduced in the recombinant cell host, such as described by Gu et al.(1993) and Sauer et al.(1988); or
(c) introducing in the genome of the cell host a polynucleotide comprising the Cre coding sequence
operably linked to a promoter functional in the recombinant cell host, which promoter is optionally
inducible, said polynucleotide being inserted in the genome of the cell host either by a random
insertion event or a homologous recombination event, such as described by Gu et al.(1994).

In a specific embodiment, the vector containing the sequence to be inserted in the olfactory
receptor gene by homologous recombination is constructed in such a way that selectable markers are
flanked by loxP sites of the same orientation. It is then possible, by treatment with the Cre enzyme, to
eliminate the selectable markers while leaving the olfactory receptor sequences of interest that have
been inserted by a homologous recombination event. Again, two selectable markers are needed: a
positive selection marker to select for the recombination event and a negative selection marker to
select for the homologous recombination event. Vectors and methods using the Cre-loxP system are
described by Zou et al.(1994).

Thus, a second preferred DNA construct of the invention comprises, from 5'-end to 3'-end:
(a) a first nucleotide sequence that is comprised in the olfactory receptor genomic sequence; (b) a
nucleotide sequence comprising a polynucleotide encoding a positive selection marker, said
nucleotide sequence comprising additionally two sequences defining a site recognized by a
recombinase, such as a loxP site, the two sites being placed in the same orientation; and (c) a second
nucleotide sequence that is comprised in the olfactory receptor genomic sequence, and is located on
the genome downstream of the first olfactory receptor nucleotide sequence (a).

The sequences defining a site recognized by a recombinase, such as a loxP site, are
preferably located within the nucleotide sequence (b) at suitable locations bordering the nucleotide
sequence for which the conditional excision is sought. In one specific embodiment, two loxP sites
are located at each side of the positive selection marker sequence, in order to allow its excision at a
desired time after the occurrence of the homologous recombination event.

In a preferred embodiment of a method using the third DNA construct described above, the 
excision of the polynucleotide fragment bordered by the two sites recognized by a recombinase, 
preferably two loxP sites, is performed at a desired time, due to the presence within the genome of 
the recombinant host cell of a sequence encoding the Cre enzyme operably linked to a promoter
sequence, preferably an inducible promoter, more preferably a tissue-specific promoter sequence and
most preferably a promoter sequence which is both inducible and tissue-specific, such as described 
by Gu et al.(1994). 

The presence of the Cre enzyme within the genome of the recombinant cell host may result 
from the breeding of two transgenic animals, the first transgenic animal bearing the olfactory 
receptor-derived sequence of interest containing the loxP sites as described above and the second
transgenic animal bearing the Cre coding sequence operably linked to a suitable promoter sequence, 
such as described by Gu et al.(1994).

Spatio-temporal control of the Cre enzyme expression may also be achieved with an
adenovirus based vector that contains the Cre gene thus allowing infection of cells, or in vivo 
infection of organs, for delivery of the Cre enzyme, such as described by Anton and Graham (1995) 
and Kanegae et al.(1995).

The DNA constructs described above may be used to introduce a desired nucleotide
sequence of the invention, preferably an olfactory receptor genomic sequence or an olfactory
receptor coding region sequence, and most preferably an altered copy of an olfactory receptor
genomic or coding region sequence, within a predetermined location of the targeted genome,
leading either to the generation of an altered copy of a targeted gene (knock-out homologous
recombination) or to the replacement of a copy of the targeted gene by another copy sufficiently
homologous to allow a homologous recombination event to occur (knock-in homologous
recombination). In a specific embodiment, the DNA constructs described above may be used to
introduce an olfactory receptor genomic sequence or an olfactory receptor coding region sequence
comprising at least one biallelic marker of the present invention, preferably at least one biallelic
marker selected from the group consisting of A1 to A13.

Nuclear Antisense DNA Constructs 

Other compositions contain a vector of the invention comprising an oligonucleotide
fragment of the nucleic sequence SEQ ID Nos 2-11, preferably a fragment including the start codon
of the olfactory receptor gene, as an antisense tool that inhibits the expression of the corresponding
olfactory receptor gene. Preferred methods using antisense polynucleotides according to the present
invention are the procedures described by Sczakiel et al.(1995) or those described in PCT
Application No WO 95/24223.

Preferred antisense polynucleotides according to the present invention are complementary to 
a sequence of the mRNAs of olfactory receptor that contains the translation initiation codon ATG. 
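
By way of illustration, an antisense sequence complementary to a region containing the initiation codon can be derived from a coding-strand sequence as follows. This is a minimal sketch only: the function name and the flank parameter are hypothetical, and taking the first ATG found as the initiation codon is a simplifying assumption that would have to be checked against the annotated open reading frame.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_over_start_codon(coding_strand: str, flank: int = 10) -> str:
    """Return antisense DNA covering the first ATG plus `flank` bases on each side.

    The window is clipped at the ends of the input sequence. The result is the
    reverse complement of the window, written 5' to 3'.
    """
    seq = coding_strand.upper()
    atg = seq.find("ATG")
    if atg == -1:
        raise ValueError("no ATG found in the supplied sequence")
    start = max(0, atg - flank)
    end = min(len(seq), atg + 3 + flank)
    return seq[start:end].translate(_COMPLEMENT)[::-1]
```

The returned oligonucleotide necessarily contains CAT, the antisense counterpart of ATG.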

Preferably, the antisense polynucleotides of the invention have a 3' polyadenylation signal
that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II
transcripts are produced without poly(A) at their 3' ends, these antisense polynucleotides being
incapable of export from the nucleus, such as described by Liu et al.(1994). In a preferred
embodiment, these olfactory receptor antisense polynucleotides also comprise, within the ribozyme
cassette, a histone stem-loop structure to stabilize cleaved transcripts against 3'-5' exonucleolytic
degradation, such as the structure described by Eckner et al.(1991).

4. Oligonucleotide probes and primers 

Polynucleotides derived from the olfactory receptor gene are useful in order to detect the 
presence of at least a copy of a nucleotide sequence of SEQ ID Nos 1-11, or a fragment,
complement, or variant thereof in a test sample, preferably a human olfactory epithelium tissue or
isolated human olfactory epithelium cells.

Particularly preferred probes and primers of the invention include isolated, purified, or
recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 
50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1 or the complements 
thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the following nucleotide 
positions of SEQ ID No 1: 1-113643, 114064-127488, 127855-144460. Additional preferred probes
and primers of the invention include isolated, purified, or recombinant polynucleotides comprising a 
contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 
1000 nucleotides of SEQ ID No 1 or the complements thereof, wherein said contiguous span 
comprises at least 1, 2, 3, 5, or 10 of the following nucleotide positions of SEQ ID No 1: 1-10000,
10001-20000, 20001-30000, 30001-40000, 40001-50000, 50001-60000, 60001-70000, 70001-80000,
80001-90000, 90001-100000, 100001-110000, 110001-120000, 120001-130000, 130001-140000,
and 140001-144460. Further preferred probes and primers of the invention include isolated,
purified, or recombinant polynucleotides comprising a contiguous span of 12, 15, 18, 20, 25, 30, 35,
40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1 or the complements
thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the following nucleotide
positions of SEQ ID No 1: 1-5000, 5001-10000, 10001-15000, 15001-20000, 20001-25000,
25001-30000, 30001-35000, 35001-40000, 40001-45000, 45001-50000, 50001-55000, 55001-60000,
60001-65000, 65001-70000, 70001-75000, 75001-80000, 80001-85000, 85001-90000,
90001-95000, 95001-100000, 100001-105000, 105001-110000, 110001-115000, 115001-120000,
120001-125000, 125001-130000, 130001-135000, 135001-140000, and 140001-144460.
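
Whether a given contiguous span satisfies a condition of the type recited above (a minimum length, and a minimum number of positions falling within the listed ranges) can be checked mechanically. The following is a minimal sketch; the three ranges are those quoted above for SEQ ID No 1, while the function names and the 1-based inclusive coordinate convention are assumptions made for the illustration.

```python
# Position ranges of SEQ ID No 1 quoted in the text (1-based, inclusive).
RANGES = [(1, 113643), (114064, 127488), (127855, 144460)]


def positions_covered(span_start: int, span_length: int, ranges=RANGES) -> int:
    """Count the positions of a contiguous span that fall inside the listed ranges."""
    span_end = span_start + span_length - 1
    covered = 0
    for low, high in ranges:
        overlap = min(span_end, high) - max(span_start, low) + 1
        if overlap > 0:
            covered += overlap
    return covered


def qualifies(span_start: int, span_length: int,
              min_length: int = 12, min_positions: int = 1) -> bool:
    """True if the span is long enough and covers enough of the listed positions."""
    return (span_length >= min_length
            and positions_covered(span_start, span_length) >= min_positions)
```

A 12-nucleotide span starting at position 113640 covers four of the listed positions (113640 to 113643), whereas a 20-nucleotide span starting at position 113700 lies wholly between two ranges and covers none.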

Other particularly preferred probes and primers of the invention include isolated, purified, or 
recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 
45 or 50 nucleotides of a sequence selected from the group consisting of SEQ ID Nos 2-11 or the
complements thereof, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the
following nucleotide positions of said selected sequence: 1-50, 51-100, 101-150, 151-200, 201-250,
251-300, 301-350, 351-400, 401-450, 451-500, 501-550, 551-600, 601-650, 651-700, 701-750,
751-800, 801-850, 851-900, 901-the terminal nucleotide of the olfactory receptor coding regions, to the
extent that such nucleotide positions are consistent with the lengths of the particular olfactory
receptor coding region being referred to. Further preferred probes and primers of the invention
include isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least
12, 15, 18, 20, 22 or 25 nucleotides of a sequence selected from the group consisting of SEQ ID Nos
2, 4, 7, 9 and 11, or the complements thereof, wherein said contiguous span comprises at least 1, 2,
3, 5, or 10 of the following nucleotide positions of said selected sequence: 1-25, 26-50, 51-75, 76-100,
101-125, 126-150, 151-175, 176-200, 201-225, 226-250, 251-275, 276-300, 301-325, 326-350,
351-375, 376-400, 401-425, 426-450, 451-475, 476-500, 501-525, 526-550, 551-575, 576-the
terminal nucleotide of the olfactory receptor coding regions, to the extent that such nucleotide
positions are consistent with the lengths of the particular olfactory receptor coding region being
referred to.

Thus, the invention also relates to nucleic acid probes characterized in that they hybridize 
specifically, under the stringent hybridization conditions defined above, with a nucleic acid selected
from the group consisting of SEQ ID Nos 1-11, a variant thereof and a sequence complementary
thereto. 

In one embodiment the invention encompasses isolated, purified, and recombinant
polynucleotides consisting of, or consisting essentially of a contiguous span of 8 to 50 nucleotides of
SEQ ID No 1 and the complement thereof, wherein said span includes an olfactory receptor-related
biallelic marker in said sequence; optionally, wherein said olfactory receptor-related biallelic
marker is selected from the group consisting of A1 to A13, and the complements thereof; optionally,
wherein said contiguous span is 18 to 47 nucleotides in length and said biallelic marker is within 4
nucleotides of the center of said polynucleotide; optionally, wherein said polynucleotide consists of
said contiguous span and said contiguous span is 25 nucleotides in length and said biallelic marker is
at the center of said polynucleotide; optionally, wherein the 3' end of said contiguous span is present
at the 3' end of said polynucleotide; and optionally, wherein the 3' end of said contiguous span is
located at the 3' end of said polynucleotide and said biallelic marker is present at the 3' end of said
polynucleotide. In a preferred embodiment, said probes comprise, consist of, or consist
essentially of a sequence selected from the following sequences: P1 to P13 and the complementary
sequences thereto, for which the respective locations in the sequence listing are provided in Table 3.

In another embodiment the invention encompasses isolated, purified and recombinant
polynucleotides comprising, consisting of, or consisting essentially of a contiguous span of 8 to 50
nucleotides of SEQ ID No 1, or the complements thereof, wherein the 3' end of said contiguous span
is located at the 3' end of said polynucleotide, and wherein the 3' end of said polynucleotide is
located within 20 nucleotides upstream of an olfactory receptor-related biallelic marker in said
sequence; optionally, wherein said olfactory receptor-related biallelic marker is selected from the
group consisting of A1 to A13, and the complements thereof; optionally, wherein the 3' end of said
polynucleotide is located 1 nucleotide upstream of said olfactory receptor-related biallelic marker in
said sequence; and optionally, wherein said polynucleotide consists essentially of a sequence
selected from the following sequences: D1 to D13 and E1 to E13, for which the respective locations
in the sequence listing are provided in Table 4.
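
The two geometries described above, namely a probe with the biallelic marker at its center and a primer whose 3' end lies a short distance upstream of the marker, can be sketched as follows. This is an illustration under stated assumptions: the function names, the default primer length of 19 nucleotides and the 1-based marker coordinate are hypothetical, and only one strand of the reference sequence is considered.

```python
def centered_probe(seq: str, marker_pos: int, length: int = 25) -> str:
    """Return a probe of odd `length` with the marker (1-based) at its exact center."""
    if length % 2 == 0:
        raise ValueError("length must be odd for the marker to sit at the center")
    half = length // 2
    start = marker_pos - 1 - half      # 0-based index of the first probe base
    end = marker_pos + half            # exclusive slice end
    if start < 0 or end > len(seq):
        raise ValueError("marker is too close to an end of the sequence")
    return seq[start:end]


def upstream_primer(seq: str, marker_pos: int,
                    length: int = 19, distance: int = 1) -> str:
    """Return a primer whose 3' terminal base lies `distance` nt upstream of the marker.

    distance=1 places the 3' end immediately adjacent to the marker; values up to
    20 correspond to the "within 20 nucleotides upstream" condition in the text.
    """
    if not 1 <= distance <= 20:
        raise ValueError("distance must be between 1 and 20")
    end = marker_pos - distance        # exclusive slice end; last base is end - 1
    start = end - length
    if start < 0:
        raise ValueError("marker is too close to the 5' end of the sequence")
    return seq[start:end]
```

With distance set to 1, extension of the primer by a single nucleotide would incorporate the base at the marker position itself.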

In a further embodiment, the invention encompasses isolated, purified, or recombinant 
polynucleotides comprising, consisting of, or consisting essentially of a sequence selected from the 
following sequences: B1 to B11 and C1 to C11, for which the respective locations in the sequence
listing are provided in Table 1.

In an additional embodiment, the invention encompasses polynucleotides for use in 
hybridization assays, sequencing assays, and enzyme-based mismatch detection assays for
determining the identity of the nucleotide at an olfactory receptor-related biallelic marker in SEQ ID
No 1, or the complements thereof, as well as polynucleotides for use in amplifying segments of 
nucleotides comprising an olfactory receptor-related biallelic marker in SEQ ID No 1, or the 
complements thereof; optionally, wherein said olfactory receptor-related biallelic marker is selected 
from the group consisting of A1 to A13, and the complements thereof.

A probe or a primer according to the invention is between 8 and 1000 nucleotides in
length, or is specified to be at least 12, 15, 18, 20, 25, 35, 40, 50, 60, 70, 80, 100, 250, 500 or 1000
nucleotides in length. More particularly, the length of these probes and primers can range from 8,
10, 15, 20, or 30 to 100 nucleotides, preferably from 10 to 50, more preferably from 15 to 30
nucleotides. Shorter probes and primers tend to lack specificity for a target nucleic acid sequence
and generally require cooler temperatures to form sufficiently stable hybrid complexes with the
template. Longer probes and primers are expensive to produce and can sometimes self-hybridize to
form hairpin structures. The appropriate length for primers and probes under a particular set of
assay conditions may be empirically determined by one of skill in the art. A preferred probe or
primer consists of a nucleic acid comprising a polynucleotide selected from the group of the
nucleotide sequences of P1 to P13 and the complementary sequence thereto, B1 to B11, C1 to C11,
D1 to D13, and E1 to E13.

Primers and other oligonucleotides according to the invention are synthesized to be
"substantially" complementary to a strand of the olfactory receptor gene of the invention to be
amplified. The primer sequence does not need to reflect the exact sequence of the DNA template.
Minor mismatches can be accommodated by reducing the stringency of the hybridization conditions.
Among the various methods available to design useful primers, the OSP computer software can be
used by the skilled person (see Hillier & Green, 1991). All primers contained a common upstream
oligonucleotide tail enabling the easy systematic sequencing of the resulting amplification
fragments.

The formation of stable hybrids depends on the melting temperature (Tm) of the DNA. The 
Tm depends on the length of the primer or probe, the ionic strength of the solution and the G+C 
content. The higher the G+C content of the primer or probe, the higher is the melting temperature 
because G:C pairs are held by three H bonds whereas A:T pairs have only two. The GC content in
the probes of the invention usually ranges between 10 and 75 %, preferably between 35 and 60 %,
and more preferably between 40 and 55 %. 
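
The dependence of the melting temperature on base composition can be illustrated numerically. The following is a minimal sketch: the Wallace rule used here (2 degrees C per A or T, 4 degrees C per G or C) is a common rule of thumb for short oligonucleotides and is an assumption not taken from this description; it ignores ionic strength and is not a substitute for empirical determination. The function names and tier labels are hypothetical.

```python
def gc_percent(oligo: str) -> float:
    """Return the G+C content of an oligonucleotide as a percentage."""
    oligo = oligo.upper()
    if not oligo:
        raise ValueError("empty sequence")
    return 100.0 * (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo: str) -> int:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def gc_tier(oligo: str) -> str:
    """Place the GC content in the nested ranges given in the text."""
    gc = gc_percent(oligo)
    if 40 <= gc <= 55:
        return "more preferred (40 to 55 %)"
    if 35 <= gc <= 60:
        return "preferred (35 to 60 %)"
    if 10 <= gc <= 75:
        return "usual (10 to 75 %)"
    return "outside the usual range"
```

Replacing A:T pairs by G:C pairs in an oligonucleotide of fixed length raises the estimate, consistent with the statement above that a higher G+C content gives a higher melting temperature.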

The primers and probes can be prepared by any suitable method, including, for example, 
cloning and restriction of appropriate sequences and direct chemical synthesis by a method such as 
the phosphotriester method of Narang et al.(1979), the phosphodiester method of Brown et al.(1979),
the diethylphosphoramidite method of Beaucage et al.(1981) and the solid support method described
in EP 0 707 592.

Detection probes are generally nucleic acid sequences or uncharged nucleic acid analogs
such as, for example, peptide nucleic acids, which are disclosed in International Patent Application
WO 92/20702, and morpholino analogs, which are described in U.S. Patents Numbered 5,185,444;
5,034,506 and 5,142,047. The probe may have to be rendered "non-extendable" in that additional
dNTPs cannot be added to the probe. In and of themselves analogs usually are non-extendable and
nucleic acid probes can be rendered non-extendable by modifying the 3' end of the probe such that
the hydroxyl group is no longer capable of participating in elongation. For example, the 3' end of
the probe can be functionalized with the capture or detection label to thereby consume or otherwise
block the hydroxyl group. Alternatively, the 3' hydroxyl group simply can be cleaved, replaced or
modified. U.S. Patent Application Serial No. 07/049,061 filed April 19, 1993 describes
modifications, which can be used to render a probe non-extendable.

Any of the polynucleotides of the present invention can be labeled, if desired, by
incorporating any label known in the art to be detectable by spectroscopic, photochemical,
biochemical, immunochemical, or chemical means. For example, useful labels include radioactive
substances (including 32P, 35S, 3H, 125I), fluorescent dyes (including 5-bromodeoxyuridine,
fluorescein, acetylaminofluorene, digoxigenin) or biotin. Preferably, polynucleotides are labeled at
their 3' and 5' ends. Examples of non-radioactive labeling of nucleic acid fragments are described
in the French patent No. FR-78 10975 or by Urdea et al.(1988) or Sanchez-Pescador et al.(1988). In
addition, the probes according to the present invention may have structural characteristics such that
they allow signal amplification, such structural characteristics being, for example, branched
DNA probes as those described by Urdea et al. in 1991 or in the European patent No. EP 0 225 807
(Chiron).

A label can also be used to capture the primer, so as to facilitate the immobilization of either
the primer or a primer extension product, such as amplified DNA, on a solid support. A capture
label is attached to the primers or probes and can be a specific binding member which forms a
binding pair with the solid phase reagent's specific binding member (e.g. biotin and streptavidin).
Therefore depending upon the type of label carried by a polynucleotide or a probe, it may be
employed to capture or to detect the target DNA. Further, it will be understood that the
polynucleotides, primers or probes provided herein, may, themselves, serve as the capture label. For
example, in the case where a solid phase reagent's binding member is a nucleic acid sequence, it
may be selected such that it binds a complementary portion of a primer or probe to thereby
immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself
serves as the binding member, those skilled in the art will recognize that the probe will contain a
sequence or "tail" that is not complementary to the target. In the case where a polynucleotide primer
itself serves as the capture label, at least a portion of the primer will be free to hybridize with a
nucleic acid on a solid phase. DNA labeling techniques are well known to the skilled technician.

The probes of the present invention are useful for a number of purposes. They can be
notably used in Southern hybridization to genomic DNA or Northern hybridization to mRNA. The 
probes can also be used to detect PCR amplification products. They may also be used to detect 
mismatches in the OLF1 to OLF10 genes or mRNA using other techniques. Generally, the probes
are complementary to the OLF1 to OLF10 gene coding sequences, although probes complementary
to non-coding sequences are also contemplated. The probes of the present invention can also be 
useful for genotyping the biallelic markers of the cluster of olfactory receptor genes of the present 
invention. 

Any of the polynucleotides, primers and probes of the present invention can be conveniently
immobilized on a solid support. Solid supports are known to those skilled in the art and include the
walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips, 
membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes 
and others. The solid support is not critical and can be selected by one skilled in the art. Thus, latex 
particles, microparticles, magnetic or non-magnetic beads, membranes, plastic tubes, walls of
microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and
duracytes are all suitable examples. Suitable methods for immobilizing nucleic acids on solid phases
include ionic, hydrophobic, covalent interactions and the like. A solid support, as used herein, refers 
to any material which is insoluble, or can be made insoluble by a subsequent reaction. The solid 
support can be chosen for its intrinsic ability to attract and immobilize the capture reagent.
Alternatively, the solid phase can retain an additional receptor which has the ability to attract and
immobilize the capture reagent. The additional receptor can include a charged substance that is 
oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to 
the capture reagent. As yet another alternative, the receptor molecule can be any specific binding 
member which is immobilized upon (attached to) the solid support and which has the ability to
immobilize the capture reagent through a specific binding reaction. The receptor molecule enables
the indirect binding of the capture reagent to a solid support material before the performance of the 
assay or during the performance of the assay. The solid phase thus can be a plastic, derivatized 
plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet, 
bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes and other
configurations known to those of ordinary skill in the art. The polynucleotides of the invention can
be attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 
15, 20, or 25 distinct polynucleotides of the invention to a single solid support. In addition, 
polynucleotides other than those of the invention may be attached to the same solid support as one or 
more polynucleotides of the invention. 

Consequently, the invention also comprises a method for detecting the presence of a nucleic
acid comprising a nucleotide sequence selected from a group consisting of SEQ ID Nos 1-11, a
fragment or a variant thereof and a complementary sequence thereto in a sample, said method
comprising the following steps:

a) bringing into contact a nucleic acid probe or a plurality of nucleic acid probes which can 
hybridize with a nucleotide sequence selected from the group consisting of the nucleotide sequences
of SEQ ID Nos 1-11, a fragment or a variant thereof and a complementary sequence thereto and the
sample to be assayed; and 

b) detecting the hybrid complex formed between the probe and a nucleic acid in the sample.

The invention further concerns a kit for detecting the presence of a nucleic acid comprising a
nucleotide sequence selected from a group consisting of SEQ ID Nos 1-11, a fragment or a variant
thereof and a complementary sequence thereto in a sample, said kit comprising:

a) a nucleic acid probe or a plurality of nucleic acid probes which can hybridize with a 
nucleotide sequence selected from the group consisting of the nucleotide sequences of SEQ ID Nos 
1-11, a fragment or a variant thereof and a complementary sequence thereto; and 

b) optionally, the reagents necessary for performing the hybridization reaction. 

In a first preferred embodiment of this detection method and kit, said nucleic acid probe or
the plurality of nucleic acid probes are labeled with a detectable molecule. In a second preferred
embodiment of said method and kit, said nucleic acid probe or the plurality of nucleic acid probes
has been immobilized on a substrate. In a third preferred embodiment, the nucleic acid probe or the
plurality of nucleic acid probes comprise either a sequence which is selected from the group
consisting of the nucleotide sequences of P1 to P13 and the complementary sequence thereto, B1 to
B11, C1 to C11, D1 to D13, E1 to E13 or a biallelic marker selected from the group consisting of A1
to A13 and the complements thereto.

Oligonucleotide arrays 

A substrate comprising a plurality of oligonucleotide primers or probes of the invention may 
be used either for detecting or amplifying targeted sequences in the olfactory receptor gene and may
also be used for detecting mutations in the coding or in the non-coding sequences of the olfactory 
receptor gene. 

Any polynucleotide provided herein may be attached in overlapping areas or at random 
locations on the solid support. Alternatively the polynucleotides of the invention may be attached in
an ordered array wherein each polynucleotide is attached to a distinct region of the solid support
which does not overlap with the attachment site of any other polynucleotide. Preferably, such an 
ordered array of polynucleotides is designed to be "addressable" where the distinct locations are 
recorded and can be accessed as part of an assay procedure. Addressable polynucleotide arrays 
typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a
substrate in different known locations. The knowledge of the precise location of each
polynucleotide makes these "addressable" arrays particularly useful in hybridization
assays. Any addressable array technology known in the art can be employed with the
polynucleotides of the invention. One particular embodiment of these polynucleotide arrays is
known as the Genechips™, and has been generally described in US Patent 5,143,854; PCT
publications WO 90/15070 and 92/10092. These arrays may generally be produced using
mechanical synthesis methods or light directed synthesis methods which incorporate a combination
of photolithographic methods and solid phase oligonucleotide synthesis (Fodor et al., 1991). The
immobilization of arrays of oligonucleotides on solid supports has been rendered possible by the 
development of a technology generally identified as "Very Large Scale Immobilized Polymer 
Synthesis" (VLSIPS™) in which, typically, probes are immobilized in a high density array on a 
solid surface of a chip. Examples of VLSIPS™ technologies are provided in US Patents 5,143,854;
and 5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, which
describe methods for forming oligonucleotide arrays through techniques such as light-directed 
synthesis techniques. In designing strategies aimed at providing arrays of nucleotides immobilized 
on solid supports, further presentation strategies were developed to order and display the 
oligonucleotide arrays on the chips in an attempt to maximize hybridization patterns and sequence
information. Examples of such presentation strategies are disclosed in PCT Publications WO
94/12305, WO 94/11530, WO 97/29212 and WO 97/31256.
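
The notion of an "addressable" array, in which each polynucleotide occupies a distinct and recorded location, can be modeled as a mapping from locations to probes. The following is a minimal sketch only; the function names and the row and column layout are hypothetical and do not describe any particular array technology cited above.

```python
def build_addressable_array(probes, n_columns: int) -> dict:
    """Assign each probe a distinct (row, column) address on a grid.

    No two probes share a location, so a signal at an address identifies
    exactly one probe.
    """
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    if len(set(probes)) != len(probes):
        raise ValueError("probes must be distinct")
    return {(i // n_columns, i % n_columns): probe
            for i, probe in enumerate(probes)}


def probes_at(array: dict, signal_addresses) -> list:
    """Look up which probes sit at the addresses where a signal was recorded."""
    return [array[address] for address in signal_addresses]
```

Because the locations are recorded when the array is built, a hybridization signal observed at a given address can be traced back to the probe sequence attached there.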

In another embodiment of the oligonucleotide arrays of the invention, an oligonucleotide 
probe matrix may advantageously be used to detect mutations occurring in the olfactory receptor 
gene. For this particular purpose, probes are specifically designed to have a nucleotide sequence
allowing their hybridization to the genes that carry known mutations (either by deletion, insertion or
substitution of one or several nucleotides). By known mutations, it is meant mutations on the
olfactory receptor gene that have been identified according to, for example, the technique used by
Huang et al.(1996) or Samson et al.(1996).

Another technique that is used to detect mutations in the olfactory receptor gene is the use of
a high-density DNA array. Each oligonucleotide probe constituting a unit element of the high
density DNA array is designed to match a specific subsequence of the olfactory receptor genomic
DNA or cDNA. Thus, an array consisting of oligonucleotides complementary to subsequences of
the target gene sequence is used to determine the identity of the target sequence with the wild-type gene
sequence, measure its amount, and detect differences between the target sequence and the reference
wild-type gene sequence of the olfactory receptor gene. One such design, termed the 4L tiled array,
implements a set of four probes (A, C, G, T), preferably 15-nucleotide oligomers, for each position
of the target sequence. In each set of
four probes, the perfect complement will hybridize more strongly than mismatched probes.
Consequently, a nucleic acid target of length L is scanned for mutations with a tiled array containing
4L probes, the whole probe set containing all the possible mutations in the known wild-type reference
sequence. The hybridization signals of the 15-mer probe set tiled array are perturbed by a single
base change in the target sequence. As a consequence, there is a characteristic loss of signal or a
"footprint" for the probes flanking a mutation position. This technique was described by Chee et al.
in 1996.
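
The tiling principle and the resulting "footprint" can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, exact sequence identity stands in for the stronger hybridization of the perfect complement, and positions closer than 7 nucleotides to either end of a linear reference are skipped, so that the sketch yields 4 x (L - 14) probes rather than exactly 4L.

```python
BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def build_tiles(reference: str, k: int = 15) -> dict:
    """Build four k-mer probes per interrogated position of the reference.

    Keys are (1-based position, substituted base); values are probe sequences
    complementary to the reference window carrying that base at its center.
    """
    half = k // 2
    tiles = {}
    for center in range(half, len(reference) - half):
        window = reference[center - half:center + half + 1]
        for base in BASES:
            variant = window[:half] + base + window[half + 1:]
            tiles[(center + 1, base)] = reverse_complement(variant)
    return tiles


def call_bases(tiles: dict, target: str, k: int = 15) -> dict:
    """Return {position: base} wherever one probe of the set matches perfectly.

    Positions where none of the four probes matches are omitted; around a
    single base change these omissions form the footprint.
    """
    half = k // 2
    calls = {}
    for (pos, base), probe in tiles.items():
        window = target[pos - 1 - half:pos + half]
        if reverse_complement(probe) == window:
            calls[pos] = base
    return calls
```

When the target carries a single substitution, the probe set centered on the substituted position reports the new base, while the sets centered on the 7 positions on either side report no perfect match, which reproduces the loss of signal flanking the mutation position.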

Consequently, the invention concerns an array of nucleic acid molecules comprising at least 
one polynucleotide described above as probes and primers. Preferably, the invention concerns an
array of nucleic acid comprising at least two polynucleotides described above as probes and primers.

A further object of the invention consists of an array of nucleic acid sequences comprising
either at least one of the sequences selected from the group consisting of P1 to P13, B1 to B11, C1 to
C11, D1 to D13, E1 to E13, the sequences complementary thereto, a fragment thereof of at least 8,
10, 12, 15, 18, 20, 25, 30, or 40 consecutive nucleotides thereof, and at least one sequence
comprising a biallelic marker selected from the group consisting of A1 to A13 and the complements
thereto.

The invention also pertains to an array of nucleic acid sequences comprising either at least 
two of the sequences selected from the group consisting of PI to PI 3, Bl to B 11, CI to CI 1, Dl to 
D13, El to El 3, the sequences complementary thereto, a fragment thereof of at least 8 consecutive 
15 nucleotides thereof, and at least two sequences comprising a biallelic marker selected from the group 
consisting of Al to A13 and the complements thereof. 

B. OLF1 TO OLF10 PROTEINS AND POLYPEPTIDE FRAGMENTS

The proteins encoded by the Open Reading Frames of the OLF1 to OLF10 genes are listed
individually in the sequence listing as SEQ ID Nos 12-21.

The term "olfactory receptor polypeptides" is used herein to embrace all of the proteins and
polypeptides of the present invention. Also forming part of the invention are polypeptides encoded
by the polynucleotides of the invention, as well as fusion polypeptides comprising such
polypeptides. The invention embodies olfactory receptor proteins from humans, including isolated
or purified olfactory receptor proteins consisting of, consisting essentially of, or comprising the
sequences of SEQ ID Nos 12-21 or naturally-occurring variants or fragments thereof.

The present invention embodies isolated, purified, and recombinant polypeptides comprising
a contiguous span of at least 6 amino acids, preferably at least 8 or 10 amino acids, more preferably
at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID Nos 12-21. In a preferred
embodiment, the present invention embodies isolated, purified, and recombinant polypeptides
comprising a contiguous span of at least 6 amino acids, preferably at least 8 or 10 amino acids, more
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID Nos 12-21 wherein said
contiguous span includes at least 1, 2, 3, 5 or 10 of the following amino acid positions in SEQ ID
Nos 12-21: 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, 141-160, 161-180, 181-200,
201-220, 221-240, 241-260, 261-280, 281-300, and 301 to the terminal amino acid of the olfactory
receptor proteins, to the extent that such amino acid positions are consistent with the lengths of the
particular olfactory receptor protein being referred to. In another preferred embodiment, the present
invention embodies isolated, purified, and recombinant polypeptides comprising a contiguous span of
at least 6 amino acids, preferably at least 8 or 10 amino acids, more preferably at least 12, 15, 20,
25, 30, 40, 50, or 100 amino acids of a sequence selected from the group consisting of SEQ ID Nos
12, 14, 17, 19 and 21 wherein said contiguous span includes at least 1, 2, 3, 5 or 10 of the following
amino acid positions of said selected sequence: 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70,
71-80, 81-90, 91-100, 101-110, 111-120, 121-130, 131-140, 141-150, 151-160, 161-170, 171-180,
181-190, and 191 to the terminal amino acid of the olfactory receptor proteins, to the extent that
such amino acid positions are consistent with the lengths of the particular olfactory receptor
protein being referred to.

In further preferred embodiments, the present invention embodies isolated, purified, and
recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least
8 or 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a
sequence selected from the group consisting of SEQ ID Nos 12-21, wherein said contiguous span
includes at least one amino acid at the following positions of said selected sequence:

i) 1-3, 10, 16, 21, 28, 33, 34, 36, 42-44, 46, 49, 53, 54, 57, 59, 63, and 64 for SEQ ID
No 12;

ii) 2, 4, 6, 8, 18, 25, 34, 37, 44, 52, 56, 80, 83, 89, 98, 101, 102, 113, 114, 117, 120,
139, 148, 158, 186, 195, 212, 219, 247, 266, 270, 280, 295, 298, 299, 301, 311, and
313-315 for SEQ ID No 13;

iii) 2-4, 6, 18, 21, 25, 34, 37, 98, 99, 102, 113, 114, 133, 143, 148, 158-163, 166, 167,
169, and 170 for SEQ ID No 14;

iv) 2, 4, 6, 8, 18, 25, 34, 37, 44, 52, 54, 56, 80, 83, 89, 98, 101, 102, 113, 114, 117, 120,
139, 148, 158, 186, 195, 212, 219, 247, 266, 270, 280, 298, 299, 311, and 313-315
for SEQ ID No 15;

v) 3, 18, 20, 25, 34, 47, 49, 67, 97, 100, 107, 108, 112, 113, 126, 135, 142, 146, 147,
157, 159-160, 194, 196, 228, 245, 264, 265, 269, 279, 298, and 302 for SEQ ID No
16;

vi) 2, 6, 18, 20, 33, 34, 37, 65, 68, 69, 72, 86, 88, 101, 107, 113, 114, 148, 158, 161,
164, 195, and 198 for SEQ ID No 17;

vii) 2, 6, 7, 52, 56, 67, 88, 94, 97, 110, 113, 116, 119, 120, 127, 135, 150, 153, 164, 174,
175, 180, 184, 217, 221, 259, 261, and 268 for SEQ ID No 18;

viii) 17, 18, 20, 28, 33, 35, 49-52, 105, 111, and 112 for SEQ ID No 19;

ix) 17, 20, 33, 35, 49-53, 56, 111, 112, 132, 138, 141, 147, 154, 157, 160, 163, 164,
194, 197, 204, 211, 214, 218, 219, 252, 265, 286, 295, 301, 303, 305, 306 and 309
for SEQ ID No 20; and

x) 9, 18, 26-28, 34, 47 and 50 for SEQ ID No 21,

to the extent that such amino acid positions are consistent with the lengths of the particular
olfactory receptor protein being referred to.

Other preferred OLF1 to OLF10 polypeptide fragments are those located outside the
transmembrane domains, most preferably peptide fragments naturally exposed on the cell
membrane, particularly those that are available for binding to ligand molecules, either odorant
substances or molecules or antibodies directed to the olfactory receptor polypeptides of the
invention. Such transmembrane domains TM1 to TM7 are boxed in Figure 1. In other preferred
embodiments the contiguous stretch of amino acids comprises the site of a mutation or functional
mutation, including a deletion, addition, swap or truncation of the amino acids in the olfactory
receptor protein sequence.

The invention also encompasses purified, isolated, or recombinant polypeptides
comprising an amino acid sequence having at least 70, 75, 80, 85, 90, 95, 98 or 99% amino acid
identity with the amino acid sequence of SEQ ID Nos 12-21 or a fragment thereof.
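
As an illustration of how such an identity threshold might be checked, the following Python sketch
computes percent identity over two sequences that have already been aligned (gaps written as "-").
The function names, the choice of denominator (all aligned columns, gaps included) and the toy
sequences are assumptions made for this example only; the method of computing identity that governs
this application is the one given in its "Definitions" section, which may differ.

```python
# Hypothetical sketch: percent amino acid identity over a pre-computed alignment.
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a residue facing a
    gap counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True when the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences, not taken from the sequence listing.
identity = percent_identity("MKTLLAVVGG", "MKSLLAVV-G")
```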

The invention also encompasses an olfactory receptor polypeptide or a fragment or a variant
thereof in which at least one peptide bond has been modified as defined in the "Definitions"
section.

A further object of the invention concerns a purified or isolated polypeptide which is
encoded by a nucleic acid comprising a nucleotide sequence selected from the group consisting of
SEQ ID Nos 1-11 or a fragment or variant thereof.

Such mutated olfactory receptor proteins may be the target of diagnostic tools, such as
specific monoclonal or polyclonal antibodies, useful for detecting the mutated olfactory receptor
proteins in a sample.

Olfactory receptor proteins are preferably isolated from human or mammalian tissue samples 
or expressed from human or mammalian genes. 

The olfactory receptor polypeptides of the invention may be extracted from cells or tissues of
humans or non-human animals. Methods for purifying proteins are known in the art, and include the
use of detergents or chaotropic agents to disrupt particles followed by differential extraction and
separation of the polypeptides by ion exchange chromatography, affinity chromatography,
sedimentation according to density, and gel electrophoresis.

In addition, shorter protein fragments may also be prepared by the conventional methods of
chemical synthesis, either in a homogeneous solution or in solid phase. As an illustrative embodiment
of such chemical polypeptide synthesis techniques, the homogeneous solution technique described by
Houbenweyl in 1974 may be cited. For solid phase synthesis the technique described by
Merrifield (1965) may be used in particular.

Alternatively, the proteins of the invention can be made using routine expression methods
known in the art as described below and in the section "Expression of an OLF1 to OLF10 coding
polynucleotide". Briefly, the polynucleotide encoding the desired polypeptide is ligated into an
expression vector suitable for any convenient host. Both eukaryotic and prokaryotic host systems may
be used in forming recombinant polypeptides. The polypeptide is then isolated from lysed cells or from
the culture medium and purified to the extent needed for its intended use. Purification is by any
technique known in the art, for example, differential extraction, salt fractionation, chromatography,
centrifugation, and the like. See, for example, Methods in Enzymology for a variety of methods for
purifying proteins.

Any olfactory receptor cDNA, including SEQ ID Nos 12-21, may be used to express olfactory
receptor proteins and polypeptides. The nucleic acid encoding the olfactory receptor protein or
polypeptide to be expressed is operably linked to a promoter in an expression vector using conventional
cloning technology. The olfactory receptor insert in the expression vector may comprise the full coding
sequence for the olfactory receptor protein or a portion thereof. For example, the olfactory receptor
derived insert may encode a polypeptide comprising at least 10 consecutive amino acids of the olfactory
receptor protein of SEQ ID Nos 12-21, including any of the polypeptide fragments defined in this
section.

The expression vector may be any of the mammalian, yeast, insect or bacterial expression systems
known in the art. Commercially available vectors and expression systems are available from a variety
of suppliers including Genetics Institute (Cambridge, MA), Stratagene (La Jolla, California), Promega
(Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and
facilitate proper protein folding, the codon context and codon pairing of the sequence may be optimized
for the particular expression organism into which the expression vector is introduced, as explained by
Hatfield et al., U.S. Patent No. 5,082,767.

In one embodiment, the entire coding sequence of the olfactory receptor cDNA through the
poly A signal of the cDNA is operably linked to a promoter in the expression vector. Alternatively, if
the nucleic acid encoding a portion of the olfactory receptor protein lacks a methionine to serve as the
initiation site, an initiating methionine can be introduced next to the first codon of the nucleic acid using
conventional techniques. Similarly, if the insert from the olfactory receptor cDNA lacks a poly A
signal, this sequence can be added to the construct by, for example, splicing out the poly A signal from
pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and incorporating it into the
mammalian expression vector pXT1 (Stratagene).

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life
Technologies, Inc., Grand Island, New York) under conditions outlined in the product specification.
Positive transfectants are selected after growing the transfected cells in 600 µg/ml G418 (Sigma, St.
Louis, Missouri).

The above procedures may also be used to express a mutant olfactory receptor protein
responsible for a detectable phenotype or a portion thereof.

Purification of the recombinant protein or peptide according to the present invention may be
realized by passage onto a Nickel or Copper affinity chromatography column. The Nickel
chromatography column may contain the Ni-NTA resin (Porath et al., 1975). The polypeptides or
peptides thus obtained may be purified, for example by high performance liquid chromatography,
such as reverse phase and/or cationic exchange HPLC, as described by Rougeot et al. (1994). The
reason to prefer this kind of peptide or protein purification is the lack of side products found in the
elution samples which renders the resultant purified protein or peptide more suitable for a
therapeutic use.

The expressed protein may also be purified using other conventional purification techniques
such as ammonium sulfate precipitation or chromatographic separation based on size or charge. The
protein encoded by the nucleic acid insert may also be purified using standard immunochromatography
techniques. In such procedures, polyclonal or monoclonal antibodies capable of specifically binding to
the expressed olfactory receptor proteins of SEQ ID Nos 12-21, or a fragment or a variant thereof, have
been previously immobilized onto a chromatography matrix. Such antibodies are described in the
section "Antibodies that bind olfactory receptor polypeptides" below. Then, a solution containing the
expressed olfactory receptor protein or portion thereof, such as a cell extract, is applied to the
chromatography column in conditions allowing the expressed protein to bind to the antibodies in the
immunochromatography column. Thereafter, the column is washed to remove non-specifically bound
proteins. The specifically bound expressed protein is then released from the column and recovered
using standard techniques.

If antibody production is not possible, the nucleic acid encoding the olfactory receptor protein
or a portion thereof may be incorporated into expression vectors designed for use in purification schemes
employing chimeric polypeptides. In such strategies the nucleic acid encoding the olfactory receptor
protein or a portion thereof is inserted in frame with the gene encoding the other half of the chimera.
The other half of the chimera may be β-globin or a nickel-binding polypeptide encoding sequence. A
chromatography matrix having antibody to β-globin or nickel attached thereto is then used to purify the
chimeric protein. Protease cleavage sites may be engineered between the β-globin gene or the
nickel-binding polypeptide and the olfactory receptor protein or portion thereof. Thus, the two
polypeptides of the chimera may be separated from one another by protease digestion.

One useful expression vector for generating β-globin chimeric proteins is pSG5 (Stratagene),
which encodes rabbit β-globin. Intron II of the rabbit β-globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of
expression. These techniques are well known to those skilled in the art of molecular biology. Standard
methods are published in methods texts such as Davis et al. (1986) and many of the methods are
available from Stratagene, Life Technologies, Inc., or Promega. Polypeptide may additionally be
produced from the construct using in vitro translation systems such as the In vitro Express™ Translation
Kit (Stratagene).

To confirm expression of the olfactory receptor protein or a portion thereof, the proteins
expressed from host cells containing an expression vector containing an insert encoding the olfactory
receptor protein or a portion thereof can be compared to the proteins expressed in host cells containing
the expression vector without an insert. The presence of a band in samples from cells containing the
expression vector with an insert which is absent in samples from cells containing the expression vector
without an insert indicates that the olfactory receptor protein or a portion thereof is being expressed.
Generally, the band will have the mobility expected for the olfactory receptor protein or portion thereof.
However, the band may have a mobility different than that expected as a result of modifications such as
glycosylation, ubiquitination, or enzymatic cleavage.

Other suitable techniques for producing and purifying the olfactory receptor proteins of the
invention or their fragments or variants are also described under the heading "Methods for
screening substances or molecules interacting with an olfactory receptor protein".

Thus, the present invention also concerns a method for producing a polypeptide of the
invention, and especially a polypeptide selected from the group of SEQ ID Nos 12-21 or a fragment
or a variant thereof, wherein said method comprises the steps of:

a) culturing, in an appropriate culture medium, a cell host previously transformed or
transfected with the recombinant vector comprising a nucleic acid encoding an olfactory receptor
polypeptide of the invention, or a fragment or a variant thereof;

b) harvesting the culture medium thus conditioned or lysing the cell host, for example by
sonication or by an osmotic shock;

c) separating or purifying the polypeptide of interest thus produced from the said culture
medium or from the pellet of the resultant host cell lysate; and

d) optionally characterizing the produced polypeptide of interest.

In a specific embodiment of the above method, step a) is preceded by a step wherein the
nucleic acid coding for an olfactory receptor polypeptide, or a fragment or a variant thereof, is
inserted in an appropriate vector, optionally after an appropriate cleavage of this amplified nucleic
acid with one or several restriction endonucleases. The nucleic acid coding for an olfactory receptor
polypeptide or a fragment or a variant thereof may be the resulting product of an amplification
reaction using a pair of primers according to the invention (by PCR, SDA, TAS, 3SR, NASBA, TMA
etc.).

C. ANTIBODIES THAT BIND OLFACTORY RECEPTOR POLYPEPTIDES 

Any olfactory receptor polypeptide or whole protein may be used to generate antibodies
capable of specifically binding to an expressed olfactory receptor protein or fragments thereof as
described.

One antibody composition of the invention is capable of specifically binding, or specifically
binds, to a variant of the olfactory receptor protein of SEQ ID Nos 12-21. For an antibody
composition to specifically bind to a first variant of olfactory receptor protein, it must demonstrate at
least a 5%, 10%, 15%, 20%, 25%, 50%, or 100% greater binding affinity for a first variant of the
olfactory receptor protein than for a second variant of the olfactory receptor protein in an ELISA,
RIA, or other antibody-based binding assay.
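
The specificity criterion stated above can be expressed as a simple relative comparison. The Python
sketch below is illustrative only: it assumes that the assay yields one positive number per variant
that is proportional to binding affinity (for instance a background-corrected ELISA readout), and
the function names and example values are hypothetical.

```python
# Hypothetical sketch of the "at least N% greater binding affinity" criterion.
def percent_greater_affinity(first_variant, second_variant):
    """Percent by which binding to the first variant exceeds binding to the second."""
    if second_variant <= 0:
        raise ValueError("the reference measurement must be positive")
    return 100.0 * (first_variant - second_variant) / second_variant


def specifically_binds(first_variant, second_variant, threshold_pct=5.0):
    """True when the excess binding reaches the chosen threshold (5% to 100%)."""
    return percent_greater_affinity(first_variant, second_variant) >= threshold_pct


# Toy readouts (assumptions): 1.5 for the first variant, 1.0 for the second.
excess = percent_greater_affinity(1.5, 1.0)
```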

In a preferred embodiment, the invention concerns antibody compositions, either polyclonal
or monoclonal, capable of selectively binding, or that selectively bind, to an epitope-containing
polypeptide comprising any of the fragments described in the section "OLF1 to OLF10 proteins and
polypeptide fragments". Preferred peptide fragments are portions of OLF1 to OLF10 polypeptides
that are located outside the transmembrane domains, most preferably peptide fragments naturally
exposed on the cell membrane, particularly those that are available for binding to ligand molecules,
either odorant substances or molecules or antibodies directed to the olfactory receptor polypeptides
of the invention.

The invention also concerns a purified or isolated antibody capable of specifically binding to
a mutated olfactory receptor protein or to a fragment or variant thereof comprising an epitope of the
mutated olfactory receptor protein. In another preferred embodiment, the present invention concerns
an antibody capable of binding to a polypeptide comprising at least 10 consecutive amino acids of an
olfactory receptor protein.

In a preferred embodiment, the invention concerns the use in the manufacture of antibodies
of a polypeptide comprising any of the fragments described in the section "OLF1 to OLF10 proteins
and polypeptide fragments". Preferred peptide fragments are portions of OLF1 to OLF10
polypeptides that are located outside the transmembrane domains, most preferably peptide fragments
naturally exposed on the cell membrane, particularly those that are available for recognition of
ligand molecules, either odorant substances or molecules or antibodies directed to the olfactory
receptor polypeptides of the invention.

The olfactory receptor expressed from a DNA comprising at least one of the nucleic
sequences of SEQ ID Nos 1-11 or a fragment or a variant thereof may also be used to generate
antibodies capable of specifically binding to the expressed olfactory receptor or fragments or
variants thereof. In a preferred embodiment, any of the polynucleotide fragments encoding a
polypeptide described in the section "Coding regions of the olfactory receptor gene" may be used to
generate such antibodies.

Substantially pure protein or polypeptide is isolated from transfected or transformed cells
containing an expression vector encoding the olfactory receptor protein or a portion thereof. The
concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibodies to the
protein can then be prepared as follows:

1. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes in the olfactory receptor of the present invention or a portion
thereof can be prepared from murine hybridomas according to the classical method of Kohler and
Milstein (1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few
micrograms of the considered olfactory receptor or a portion thereof over a period of a few weeks. The
mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are
fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The
successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate
where growth of the culture is continued. Antibody-producing clones are identified by detection of
antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally
described by Engvall (1980), and derivative methods thereof. Selected positive clones can be expanded
and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody
production are described in Davis, L. et al.

2. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the olfactory receptor
of the present invention or a portion thereof can be prepared by immunizing suitable animals with the
considered olfactory receptor or a portion thereof, which can be unmodified or modified to enhance
immunogenicity. A suitable non-human animal, preferably a non-human mammal, is selected,
usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been
enriched for olfactory receptor concentration can be used to generate antibodies. Such proteins,
fragments or preparations are introduced into the non-human mammal in the presence of an
appropriate adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition,
the protein, fragment or preparation can be pretreated with an agent which will increase antigenicity;
such agents are known in the art and include, for example, methylated bovine serum albumin
(mBSA), bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin
(KLH). Serum from the immunized animal is collected, treated and tested according to known
procedures. If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal
antibodies can be purified by immunoaffinity chromatography.

Effective polyclonal antibody production is affected by many factors related both to the antigen
and the host species. Also, host animals vary in response to site of inoculations and dose, with both
inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen
administered at multiple intradermal sites appear to be most reliable. Techniques for producing and
processing polyclonal antisera are known in the art; see, for example, Mayer and Walker (1987). An
effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. (1971).

Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer
thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against
known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al. (1973).
Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum. Affinity of the
antisera for the antigen is determined by preparing competitive binding curves, as described, for
example, by Fisher (1980).

Antibody preparations prepared according to either the monoclonal or the polyclonal protocol
are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances
in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of
antigen in a biological sample. The antibodies may also be used in therapeutic compositions for killing
cells expressing the protein or reducing the levels of the protein in the body.

Non-human animals or mammals, whether wild-type or transgenic, which express a different
species of olfactory receptor than the one to which antibody binding is desired, and animals which
do not express olfactory receptor (i.e. an olfactory receptor knock out animal as described herein) are
particularly useful for preparing antibodies. Olfactory receptor knock out animals will recognize all
or most of the exposed regions of an olfactory receptor protein as foreign antigens, and therefore
produce antibodies directed against a wider array of olfactory receptor epitopes. Moreover, smaller
polypeptides with only 10 to 30 amino acids may be useful in obtaining specific binding to any one
of the olfactory receptor proteins. In addition, the humoral immune system of animals which
produce a species of olfactory receptor that resembles the antigenic sequence will preferentially
recognize the differences between the animal's native olfactory receptor species and the antigen
sequence, and produce antibodies to these unique sites in the antigen sequence. Such a technique
will be particularly useful in obtaining antibodies that specifically bind to any one of the olfactory
receptor proteins.

The present invention also includes chimeric single chain Fv antibody fragments (Martineau et
al., 1998), antibody fragments obtained through phage display libraries (Ridder et al., 1995; Vaughan et
al., 1995) and humanized antibodies (Reinmann et al., 1997; Leger et al., 1997).

The antibodies of the invention may be labeled by any one of the radioactive, fluorescent or
enzymatic labels known in the art.

Consequently, the invention is also directed to a method for specifically detecting the
presence of a polypeptide according to the invention in a biological sample, said method comprising
the following steps:

a) bringing the biological sample into contact with an antibody according to the
invention;

b) detecting the antigen-antibody complex formed.

Also part of the invention is a diagnostic kit for detecting in vitro the presence of a
polypeptide according to the present invention in a biological sample, wherein said kit comprises:

a) a polyclonal or monoclonal antibody as described above, optionally labeled;

b) a reagent allowing the detection of the antigen-antibody complexes formed, said
reagent optionally carrying a label, or being able to be recognized itself by a labeled reagent,
more particularly in the case when the above-mentioned monoclonal or polyclonal antibody
is not itself labeled.

D. OLFACTORY RECEPTOR-RELATED BIALLELIC MARKERS

The invention also concerns olfactory receptor-related biallelic markers. As used herein the
term "olfactory receptor-related biallelic marker" relates to a set of biallelic markers in linkage
disequilibrium with the olfactory receptor gene. The term olfactory receptor-related biallelic marker
includes the biallelic markers designated A1 to A13.

The biallelic markers of the present invention, namely A1 to A13, are disclosed in Table 2 of
Example 4. The 13 olfactory receptor-related biallelic markers, A1 to A13, are all located in the
genomic non-coding regions of the olfactory gene cluster of the invention. Their precise location on
the olfactory receptor genomic sequence and their single base polymorphism are indicated in Table 2
and also as features in the sequence listing for SEQ ID No 1. Appropriate pairs of primers allowing
the amplification of a nucleic acid containing the polymorphic base of the disclosed olfactory
receptor biallelic marker are also listed in Table 1 of Example 3 and in features of SEQ ID No 1.

In the present invention, the biallelic markers can be defined by nucleotide sequences
corresponding to oligonucleotides of 47 bases in length comprising one of the polymorphic bases at
the middle position. More particularly, the biallelic markers can be defined by the polynucleotides
P1 to P13.
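
By way of illustration, a 47-base marker sequence with the polymorphic base at its middle can be
derived from a genomic sequence as sketched below in Python. The function name, the 0-based
coordinate convention and the toy genomic sequence are assumptions made for this example; the actual
marker positions and alleles are those given in Table 2 and in the features of SEQ ID No 1.

```python
# Hypothetical sketch: build a 47-mer centered on a polymorphic base.
def marker_47mer(genomic, position, allele, length=47):
    """Return a `length`-base sequence with `allele` at its middle.

    `position` is the 0-based index of the polymorphic base in `genomic`.
    For a 47-mer, 23 flanking bases are taken on each side.
    """
    flank = length // 2
    if position - flank < 0 or position + flank >= len(genomic):
        raise ValueError("not enough flanking sequence around the polymorphic base")
    upstream = genomic[position - flank:position]
    downstream = genomic[position + 1:position + flank + 1]
    return upstream + allele + downstream


# Toy genomic context (assumption): a C/T polymorphism at index 28.
genomic = "T" * 5 + "A" * 23 + "C" + "G" * 23 + "T" * 5
allele_1 = marker_47mer(genomic, 28, "C")
allele_2 = marker_47mer(genomic, 28, "T")
```

The two alleles of a marker then correspond to two 47-mers that differ only at the 24th base.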

The biallelic markers contained in the olfactory gene cluster of the present invention, or a
subset of such biallelic markers, are useful tools to perform association studies, preferably to
perform association studies between the statistically significant occurrence of an allele of said
biallelic marker in the genome of an individual and a specific phenotype, including a phenotype
consisting of an alteration of the olfactory perception of odorant substances or molecules by said
individual. The biallelic markers of the invention can also be used, for example, in linkage analysis
in which evidence is sought, using family studies, for cosegregation between a locus and a putative
trait locus, such as a locus for an alteration of olfactory perception. In addition, the biallelic
markers of the invention may be included in the generation of any complete or partial genetic map of
the human genome. These different uses are specifically contemplated in the present invention and claims.

1. Identification of biallelic markers 

Any of a variety of methods can be used to screen a genomic fragment for single nucleotide
polymorphisms such as differential hybridization with oligonucleotide probes, detection of changes
in the mobility measured by gel electrophoresis or direct sequencing of the amplified nucleic acid.
A preferred method for identifying biallelic markers involves comparative sequencing of genomic
DNA fragments from an appropriate number of unrelated individuals.

In a first embodiment, DNA samples from unrelated individuals are pooled together,
following which the genomic DNA of interest is amplified and sequenced. The nucleotide
sequences thus obtained are then analyzed to identify significant polymorphisms. One of the major
advantages of this method resides in the fact that the pooling of the DNA samples substantially
reduces the number of DNA amplification reactions and sequencing reactions which must be carried
out. Moreover, this method is sufficiently sensitive so that a biallelic marker obtained thereby
usually shows a sufficient degree of informativeness to be useful in conducting association studies.

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In a second embodiment, the DNA samples are not pooled and are therefore amplified and 
sequenced individually. This method is usually preferred when biallelic markers need to be 
identified in order to perform association studies within candidate genes. Preferably, highly relevant 
gene regions such as promoter regions or exon regions may be screened for biallelic markers. A 
5 biallelic marker obtained using this method may show a lower degree of informativeness for 

conducting association studies, e.g. if the frequency of its less frequent allele may be less than about 
10%. Such a biallelic marker will, however, be sufficiently informative to conduct association 
studies and it will further be appreciated that including less informative biallelic markers in the 
genetic analysis studies of the present invention, may allow in some cases the direct identification of 
10 causal mutations, which may, depending on their penetrance, be rare mutations. 
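
As a rough illustration of how the frequency of the less frequent allele relates to informativeness, the sketch below uses expected heterozygosity under Hardy-Weinberg proportions, 2pq. This choice of measure is an assumption made for illustration; the passage above does not name a specific measure, and the function name is hypothetical.

```python
def expected_heterozygosity(minor_allele_freq: float) -> float:
    """Expected heterozygosity 2pq of a biallelic marker.

    Assumes Hardy-Weinberg proportions; q is the frequency of the
    less frequent allele and p = 1 - q that of the other allele.
    """
    if not 0.0 <= minor_allele_freq <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    q = minor_allele_freq
    p = 1.0 - q
    return 2.0 * p * q


# Compare a rare allele, the ~10% threshold mentioned above, and common alleles.
for q in (0.01, 0.10, 0.30, 0.50):
    h = expected_heterozygosity(q)
    print(f"minor allele frequency {q:.2f} -> expected heterozygosity {h:.4f}")
```

Under this measure a marker whose less frequent allele is at 10% has an expected heterozygosity of about 0.18, against a maximum of 0.5 for a marker with two equally frequent alleles.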

The following is a description of the various parameters of a preferred method used by the 
inventors for the identification of the biallelic markers of the present invention. 

Genomic DNA Samples 

The genomic DNA samples from which the biallelic markers of the present invention are
generated are preferably obtained from unrelated individuals corresponding to a heterogeneous
population of known ethnic background. The number of individuals from whom DNA samples are
obtained can vary substantially, preferably from about 10 to about 1000, more preferably from about 50 to
about 200 individuals. It is usually preferred to collect DNA samples from at least about 100
individuals in order to have sufficient polymorphic diversity in a given population to identify as
many markers as possible and to generate statistically significant results.

As for the source of the genomic DNA to be subjected to analysis, any test sample can be
foreseen without any particular limitation. These test samples include biological samples, which can
be tested by the methods of the present invention described herein, and include human and animal
body fluids such as whole blood, serum, plasma, cerebrospinal fluid, urine, lymph fluids, and
various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk,
white blood cells, myelomas and the like; biological fluids such as cell culture supernatants; fixed
tissue specimens including tumor and non-tumor tissue and lymph node tissues; bone marrow
aspirates and fixed cell specimens. The preferred source of genomic DNA used in the present
invention is from peripheral venous blood of each donor. Techniques to prepare genomic DNA
from biological samples are well known to the skilled technician. Details of a preferred embodiment
are provided in Example 2. The person skilled in the art can choose to amplify pooled or unpooled
DNA samples.

DNA Amplification 

The identification of biallelic markers in a sample of genomic DNA may be facilitated
through the use of DNA amplification methods. DNA samples can be pooled or unpooled for the
amplification step. DNA amplification techniques are well known to those skilled in the art.

Amplification techniques that can be used in the context of the present invention include, but
are not limited to, the ligase chain reaction (LCR) described in EP-A-320 308, WO 9320227 and
EP-A-439 182, the polymerase chain reaction (PCR, RT-PCR) and techniques such as the nucleic
acid sequence based amplification (NASBA) described in Guatelli J.C., et al. (1990) and in Compton
J. (1991), Q-beta amplification as described in European Patent Application No 4544610, strand
displacement amplification as described in Walker et al. (1996) and EP A 684 315, and target
mediated amplification as described in PCT Publication WO 9322461.

LCR and Gap LCR are exponential amplification techniques; both depend on DNA ligase to
join adjacent primers annealed to a DNA molecule. In Ligase Chain Reaction (LCR), probe pairs
are used which include two primary (first and second) and two secondary (third and fourth) probes,
all of which are employed in molar excess to target. The first probe hybridizes to a first segment of
the target strand and the second probe hybridizes to a second segment of the target strand, the first
and second segments being contiguous so that the primary probes abut one another in 5' phosphate-
3' hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused
product. In addition, a third (secondary) probe can hybridize to a portion of the first probe and a
fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion.
Of course, if the target is initially double stranded, the secondary probes also will hybridize to the
target complement in the first instance. Once the ligated strand of primary probes is separated from
the target strand, it will hybridize with the third and fourth probes, which can be ligated to form a
complementary, secondary ligated product. It is important to realize that the ligated products are
functionally equivalent to either the target or its complement. By repeated cycles of hybridization
and ligation, amplification of the target sequence is achieved. A method for multiplex LCR has also
been described (WO 9320227). Gap LCR (GLCR) is a version of LCR where the probes are not
adjacent but are separated by 2 to 3 bases.

For amplification of mRNAs, it is within the scope of the present invention to reverse
transcribe mRNA into cDNA followed by polymerase chain reaction (RT-PCR); or, to use a single
enzyme for both steps as described in U.S. Patent No. 5,322,770; or, to use Asymmetric Gap LCR
(RT-AGLCR) as described by Marshall et al. (1994). AGLCR is a modification of GLCR that
allows the amplification of RNA.

The PCR technology is the preferred amplification technique used in the present invention.
A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology,
see White (1997) and the publication entitled "PCR Methods and Applications" (1991, Cold Spring
Harbor Laboratory Press). In each of these PCR procedures, PCR primers on either side of the
nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along
with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent
polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically
hybridized to complementary nucleic acid sequences in the sample. The hybridized primers are
extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The
cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid
sequence between the primer sites. PCR has further been described in several patents including US
Patents 4,683,195; 4,683,202; and 4,965,188.

The PCR technology is the preferred amplification technique used to identify new biallelic
markers. A typical example of a PCR reaction suitable for the purposes of the present invention is
provided in Example 3.
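
The repeated cycles of denaturation, hybridization and extension described above can be summarized by an idealized growth model. The sketch below is illustrative only: the per-cycle efficiency parameter and the function name are assumptions, and real reactions plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    Each cycle multiplies the copy number by (1 + efficiency), where an
    efficiency of 1.0 means perfect doubling. Plateau effects are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


# Growth from a single template molecule at two assumed efficiencies.
for n in (10, 20, 30):
    ideal = pcr_copies(1, n)
    reduced = pcr_copies(1, n, efficiency=0.9)
    print(f"{n} cycles: {ideal:.3e} copies (ideal), {reduced:.3e} copies (90% efficiency)")
```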

One of the aspects of the present invention is a method for the amplification of the human
olfactory receptor gene, particularly of a fragment of the genomic sequence of SEQ ID No 1 or of
the coding region sequences of SEQ ID Nos 2-11, or a fragment or a variant thereof in a test sample,
preferably using the PCR technology. This method comprises the steps of:

a) contacting a test sample with amplification reaction reagents comprising a pair of
amplification primers as described above and located on either side of the polynucleotide
region to be amplified, and

b) optionally, detecting the amplification products.

The invention also concerns a kit for the amplification of an olfactory receptor gene sequence,
particularly of a portion of the genomic sequence of SEQ ID No 1 or of the coding region sequences
of SEQ ID Nos 2-11, or a variant thereof in a test sample, wherein said kit comprises:

a) a pair of oligonucleotide primers located on either side of the olfactory receptor region to
be amplified;

b) optionally, the reagents necessary for performing the amplification reaction.

In one embodiment of the above amplification method and kit, the amplification product is
detected by hybridization with a labeled probe having a sequence which is complementary to the
amplified region. In another embodiment of the above amplification method and kit, primers
comprise a sequence which is selected from the group consisting of the nucleotide sequences of B1
to B11, C1 to C11, D1 to D13, and E1 to E13.

In a first embodiment of the present invention, biallelic markers are identified using genomic
sequence information generated by the inventors. Sequenced genomic DNA fragments are used to
design primers for the amplification of 500 bp fragments. These 500 bp fragments are amplified
from genomic DNA and are scanned for biallelic markers. Primers may be designed using the OSP
software (Hillier L. and Green P., 1991). All primers may contain, upstream of the specific target
bases, a common oligonucleotide tail that serves as a sequencing primer. Those skilled in the art are
familiar with primer extensions, which can be used for these purposes.
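
The layout described above (amplicons of about 500 bp, each primer carrying a common 5' tail upstream of its target-specific bases) can be sketched as follows. This is not the OSP algorithm: primer positions are simply taken from the ends of each window without any melting-temperature or specificity checks, and the tail sequences, window overlap and function names are placeholders assumed for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Placeholder tails: the actual common tail sequences are not given in the text.
FORWARD_TAIL = "TGTAAAACGACGGCCAGT"
REVERSE_TAIL = "CAGGAAACAGCTATGACC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_amplicons(region: str, amplicon_len: int = 500,
                   primer_len: int = 20, overlap: int = 50):
    """Tile a genomic region into overlapping amplicons with tailed primers.

    Returns a list of (start, end, forward_primer, reverse_primer) tuples,
    with 0-based half-open coordinates on the region.
    """
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be smaller than the amplicon length")
    step = amplicon_len - overlap
    amplicons = []
    start = 0
    while start + amplicon_len <= len(region):
        end = start + amplicon_len
        # Tail sits 5' of the target-specific bases on both primers.
        forward = FORWARD_TAIL + region[start:start + primer_len]
        reverse = REVERSE_TAIL + reverse_complement(region[end - primer_len:end])
        amplicons.append((start, end, forward, reverse))
        start += step
    return amplicons


# Toy 1200 bp region used only to exercise the function.
region = "ACGT" * 300
tiles = tile_amplicons(region)
for start, end, forward, reverse in tiles:
    print(start, end, forward, reverse)
```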

Sequencing Of Amplified Genomic DNA And Identification Of Single Nucleotide Polymorphisms

The amplification products generated as described above are then sequenced using any
method known and available to the skilled technician. Methods for sequencing DNA using either
the dideoxy-mediated method (Sanger method) or the Maxam-Gilbert method are widely known to



wo 00/21985 PCT/IB99/01729 

those of ordinary skill in the art. Such methods are for example disclosed in Sambrook et al. (1989).
Alternative approaches include hybridization to high-density DNA probe arrays as described in Chee
et al. (1996).

Preferably, the amplified DNA is subjected to automated dideoxy terminator sequencing
reactions using a dye-primer cycle sequencing protocol. Following gel image analysis and DNA
sequence extraction, sequence data are automatically processed with adequate software to assess
sequence quality.

Polymorphism analysis software is used that detects the presence of biallelic sites among
individual or pooled amplified fragment sequences. Polymorphism search is based on the presence
of superimposed peaks in the electrophoresis pattern. These peaks, which present distinct colors,
correspond to two different nucleotides at the same position on the sequence. The polymorphism has
to be detected on both strands for validation.
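
The two-strand rule above can be sketched in a few lines. The example assumes that peak heights for the four bases are already available at an aligned position on each strand, with the reverse-strand read expressed in forward-strand bases. The 0.3 secondary-to-primary peak ratio is an arbitrary threshold chosen for illustration, not a value taken from the text, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

Peaks = Dict[str, float]  # base -> peak height at one trace position


def superimposed_bases(peaks: Peaks, min_ratio: float) -> Optional[FrozenSet[str]]:
    """Return the two bases forming superimposed peaks, or None if only one peak."""
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (base1, height1), (base2, height2) = ranked[0], ranked[1]
    if height1 > 0 and height2 / height1 >= min_ratio:
        return frozenset((base1, base2))
    return None


def call_biallelic_site(forward: Peaks, reverse: Peaks,
                        min_ratio: float = 0.3) -> Optional[FrozenSet[str]]:
    """Report a candidate biallelic site only if both strands show the same two bases."""
    fwd = superimposed_bases(forward, min_ratio)
    rev = superimposed_bases(reverse, min_ratio)
    if fwd is not None and fwd == rev:
        return fwd
    return None


# Toy peak heights: an A/G double peak seen on both strands.
forward_peaks = {"A": 1000.0, "G": 800.0, "C": 20.0, "T": 10.0}
reverse_peaks = {"A": 900.0, "G": 600.0, "C": 15.0, "T": 5.0}
print(call_biallelic_site(forward_peaks, reverse_peaks))
```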

Validation Of The Biallelic Markers Of The Present Invention 

The polymorphisms are evaluated for their usefulness as genetic markers by validating that
both alleles are present in a population. Validation of the biallelic markers is accomplished by
genotyping a group of individuals by a method of the invention and demonstrating that both alleles
are present. Microsequencing is a preferred method of genotyping alleles. The validation by
genotyping step may be performed on individual samples derived from each individual in the group
or by genotyping a pooled sample derived from more than one individual. The group can be as
small as one individual if that individual is heterozygous for the allele in question. Preferably the
group contains at least three individuals, more preferably the group contains five or six individuals,
so that a single validation test will be more likely to result in the validation of more of the biallelic
markers that are being tested. It should be noted, however, that when the validation test is
performed on a small group it may give a false negative result if, as a result of sampling error,
none of the individuals tested carries one of the two alleles. Thus, the validation process is less
useful in demonstrating that a particular initial result is an artifact than it is at demonstrating that
there is a bona fide biallelic marker at a particular position in a sequence. All of the genotyping,
haplotyping, association, and interaction study methods of the invention may optionally be
performed solely with validated biallelic markers.
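
The false negative risk mentioned above can be estimated with a simple calculation. The sketch assumes diploid, unrelated individuals drawn at random, so that the 2n sampled chromosomes are independent; under that assumption the chance that none of them carries an allele of frequency q is (1 - q)^(2n). The function name is hypothetical.

```python
def prob_allele_missed(allele_freq: float, n_individuals: int) -> float:
    """Probability that an allele is absent from all 2n sampled chromosomes.

    Assumes diploid, unrelated individuals sampled at random.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("number of individuals must be non-negative")
    return (1.0 - allele_freq) ** (2 * n_individuals)


# Risk of missing an allele of frequency 10% for several group sizes.
for n in (1, 3, 6, 20):
    print(f"{n} individuals: P(allele not observed) = {prob_allele_missed(0.10, n):.3f}")
```

With an allele frequency of 10%, a group of three individuals fails to show the allele slightly more than half of the time, which is consistent with the caution expressed above about small validation groups.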

2. Genotyping of biallelic markers

The polymorphisms identified above can be further confirmed and their respective
frequencies can be determined through various methods using the previously described primers and
probes. These methods can also be useful for genotyping either new populations in association
studies or individuals in the context of detection of alleles of biallelic markers which are known to
be associated with a given trait. Those skilled in the art should note that the methods described
below can be equally performed on individual or pooled DNA samples.

Once a given polymorphic site has been found and characterized as a biallelic marker as
described above, several methods can be used in order to determine the specific allele carried by an
individual at the given polymorphic base.

The identification of biallelic markers described previously allows the design of appropriate
primers to amplify a region of the olfactory receptor gene cluster containing the polymorphic site of
interest and for the detection of such polymorphisms.

Genotyping can be performed using similar methods as those described above for the
identification of the biallelic markers, or using other genotyping methods such as those further
described below. In preferred embodiments, the comparison of sequences of amplified genomic
fragments from different individuals is used to identify new biallelic markers whereas
microsequencing is used for genotyping known biallelic markers in diagnostic and genetic analysis
applications.

In one embodiment the invention encompasses methods of genotyping comprising
determining the identity of a nucleotide at an olfactory receptor-related biallelic marker or the
complement thereof in a biological sample; optionally, wherein said olfactory receptor-related
biallelic marker is selected from the group consisting of A1 to A13, and the complements thereof;
optionally, wherein said biological sample is derived from a single subject; optionally, wherein the
identity of the nucleotides at said biallelic marker is determined for both copies of said biallelic
marker present in said individual's genome; optionally, wherein said biological sample is derived
from multiple subjects; optionally, the genotyping methods of the invention encompass methods
with any further limitation described in this disclosure, or those following, specified alone or in any
combination; optionally, said method is performed in vitro; optionally, further comprising
amplifying a portion of said sequence comprising the biallelic marker prior to said determining step;
optionally, wherein said amplifying is performed by PCR, LCR, or replication of a recombinant
vector comprising an origin of replication and said fragment in a host cell; optionally, wherein said
determining is performed by a hybridization assay, a sequencing assay, a microsequencing assay, or
an enzyme-based mismatch detection assay.

Source of Nucleic Acids for genotyping 

Any source of nucleic acids, in purified or non-purified form, can be utilized as the starting
nucleic acid, provided it contains or is suspected of containing the specific nucleic acid sequence
desired. DNA or RNA may be extracted from cells, tissues, body fluids and the like as described
above. While nucleic acids for use in the genotyping methods of the invention can be derived from
any mammalian source, the test subjects and individuals from which nucleic acid samples are taken
are generally understood to be human.

Amplification Of DNA Fragments Comprising Biallelic Markers

Methods and polynucleotides are provided to amplify a segment of nucleotides comprising
one or more biallelic markers of the present invention. It will be appreciated that amplification of
DNA fragments comprising biallelic markers may be used in various methods and for various
purposes and is not restricted to genotyping. Nevertheless, many genotyping methods, although not
all, require the previous amplification of the DNA region carrying the biallelic marker of interest.
Such methods specifically increase the concentration or total number of sequences that span the
biallelic marker or include that site and sequences located either distal or proximal to it. Diagnostic
assays may also rely on amplification of DNA segments carrying a biallelic marker of the present
invention. Amplification of DNA may be achieved by any method known in the art. Amplification
techniques are described above in the section entitled "DNA Amplification".

Some of these amplification methods are particularly suited for the detection of single 
nucleotide polymorphisms and allow the simultaneous amplification of a target sequence and the 
identification of the polymorphic nucleotide as it is further described below. 

The identification of biallelic markers as described above allows the design of appropriate
oligonucleotides, which can be used as primers to amplify DNA fragments comprising the biallelic
markers of the present invention. Amplification can be performed using the primers initially used to
discover new biallelic markers which are described herein or any set of primers allowing the
amplification of a DNA fragment comprising a biallelic marker of the present invention.

In some embodiments the present invention provides primers for amplifying a DNA
fragment containing one or more biallelic markers of the present invention. Preferred amplification
primers are listed in Example 3. It will be appreciated that the primers listed are merely exemplary
and that any other set of primers which produces amplification products containing one or more
biallelic markers of the present invention is also of use.

The spacing of the primers determines the length of the segment to be amplified. In the
context of the present invention, amplified segments carrying biallelic markers can range in size
from at least about 25 bp to 35 kbp. Amplification fragments from 25-3000 bp are typical,
fragments from 50-1000 bp are preferred and fragments from 100-600 bp are highly preferred. It
will be appreciated that amplification primers for the biallelic markers may be any sequence which
allows the specific amplification of any DNA fragment carrying the markers. Amplification primers
may be labeled or immobilized on a solid support as described in "Oligonucleotide probes and
primers".

Methods of Genotyping DNA samples for Biallelic Markers

Any method known in the art can be used to identify the nucleotide present at a biallelic
marker site. Since the biallelic marker allele to be detected has been identified and specified in the
present invention, detection will prove simple for one of ordinary skill in the art by employing any
of a number of techniques. Many genotyping methods require the previous amplification of the
DNA region carrying the biallelic marker of interest. While the amplification of target or signal is
often preferred at present, ultrasensitive detection methods which do not require amplification are
also encompassed by the present genotyping methods. Methods well-known to those skilled in the
art that can be used to detect biallelic polymorphisms include conventional dot blot
analyses, single strand conformational polymorphism analysis (SSCP) described by Orita et
al. (1989), denaturing gradient gel electrophoresis (DGGE), heteroduplex analysis, mismatch
cleavage detection, and other conventional techniques as described in Sheffield et al. (1991), White
et al. (1992), Grompe et al. (1989 and 1993). Another method for determining the identity of the
nucleotide present at a particular polymorphic site employs a specialized exonuclease-resistant
nucleotide derivative as described in US patent 4,656,127.

Preferred methods involve directly determining the identity of the nucleotide present at a
biallelic marker site by sequencing assay, enzyme-based mismatch detection assay, or hybridization
assay. The following is a description of some preferred methods. A highly preferred method is the
microsequencing technique. The term "sequencing" is generally used herein to refer to polymerase
extension of duplex primer/template complexes and includes both traditional sequencing and
microsequencing.

1) Sequencing Assays

The nucleotide present at a polymorphic site can be determined by sequencing methods. In
a preferred embodiment, DNA samples are subjected to PCR amplification before sequencing as
described above. DNA sequencing methods are described in "Sequencing Of Amplified Genomic
DNA And Identification Of Single Nucleotide Polymorphisms".

Preferably, the amplified DNA is subjected to automated dideoxy terminator sequencing 
reactions using a dye-primer cycle sequencing protocol. Sequence analysis allows the identification 
of the base present at the biallelic marker site. 

2) Microsequencing Assays

In microsequencing methods, the nucleotide at a polymorphic site in a target DNA is
detected by a single nucleotide primer extension reaction. This method involves appropriate
microsequencing primers which hybridize just upstream of the polymorphic base of interest in the
target nucleic acid. A polymerase is used to specifically extend the 3' end of the primer with one
single ddNTP (chain terminator) complementary to the nucleotide at the polymorphic site. Next, the
identity of the incorporated nucleotide is determined in any suitable way.
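
The geometry of the single nucleotide primer extension can be sketched as follows. The example is a simplified model: the sequences, the 19-nucleotide primer length and the function names are assumptions for illustration. The primer is written with the same sequence as the strand shown, so it anneals to the complementary template strand with its 3' end immediately upstream of the polymorphic base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def microsequencing_primer(strand: str, snp_index: int, primer_len: int = 19) -> str:
    """Primer whose 3' end lies immediately upstream of the polymorphic base.

    The primer has the same sequence as `strand` and therefore anneals to
    the complementary (template) strand.
    """
    if snp_index < primer_len:
        raise ValueError("not enough upstream sequence for the primer")
    return strand[snp_index - primer_len:snp_index]


def incorporated_ddntp(template_base: str) -> str:
    """Chain terminator added by the polymerase opposite a template base."""
    return "dd" + COMPLEMENT[template_base] + "TP"


def microsequencing_readout(strand: str, snp_index: int) -> str:
    """ddNTP expected to extend the primer for the allele present in `strand`."""
    template_base = COMPLEMENT[strand[snp_index]]  # base on the template strand
    return incorporated_ddntp(template_base)


# Toy sequence with a G allele at the polymorphic position.
upstream = "ACGTTGCAAGGCTTAACCGGTACC"
strand = upstream + "G" + "TTAGC"
snp_index = len(upstream)
primer = microsequencing_primer(strand, snp_index)
readout = microsequencing_readout(strand, snp_index)
print(primer, readout)
```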

Typically, microsequencing reactions are carried out using fluorescent ddNTPs and the
extended microsequencing primers are analyzed by electrophoresis on ABI 377 sequencing
machines to determine the identity of the incorporated nucleotide as described in EP 412 883.
Alternatively, capillary electrophoresis can be used in order to process a higher number of assays
simultaneously. An example of a typical microsequencing procedure that can be used in the context
of the present invention is provided in Example 5.

Different approaches can be used for the labeling and detection of ddNTPs. A homogeneous
phase detection method based on fluorescence resonance energy transfer has been described by Chen
and Kwok (1997) and Chen et al. (1997). In this method, amplified genomic DNA fragments
containing polymorphic sites are incubated with a 5'-fluorescein-labeled primer in the presence of
allelic dye-labeled dideoxyribonucleoside triphosphates and a modified Taq polymerase. The dye-
labeled primer is extended one base by the dye-terminator specific for the allele present on the
template. At the end of the genotyping reaction, the fluorescence intensities of the two dyes in the
reaction mixture are analyzed directly without separation or purification. All these steps can be
performed in the same tube and the fluorescence changes can be monitored in real time.
Alternatively, the extended primer may be analyzed by MALDI-TOF Mass Spectrometry. The base
at the polymorphic site is identified by the mass added onto the microsequencing primer (see Haff
and Smirnov, 1997).

Microsequencing may be achieved by the established microsequencing method or by
developments or derivatives thereof. Alternative methods include several solid-phase
microsequencing techniques. The basic microsequencing protocol is the same as described
previously, except that the method is conducted as a heterogeneous phase assay, in which the primer
or the target molecule is immobilized or captured onto a solid support. To simplify the primer
separation and the terminal nucleotide addition analysis, oligonucleotides are attached to solid
supports or are modified in such ways that permit affinity separation as well as polymerase
extension. The 5' ends and internal nucleotides of synthetic oligonucleotides can be modified in a
number of different ways to permit different affinity separation approaches, e.g., biotinylation. If a
single affinity group is used on the oligonucleotides, the oligonucleotides can be separated from the
incorporated terminator reagent. This eliminates the need for physical or size separation. More than
one oligonucleotide can be separated from the terminator reagent and analyzed simultaneously if
more than one affinity group is used. This permits the analysis of several nucleic acid species or
more nucleic acid sequence information per extension reaction. The affinity group need not be on
the priming oligonucleotide but could alternatively be present on the template. For example,
immobilization can be carried out via an interaction between biotinylated DNA and streptavidin-
coated microtitration wells or avidin-coated polystyrene particles. In the same manner,
oligonucleotides or templates may be attached to a solid support in a high-density format. In such
solid phase microsequencing reactions, incorporated ddNTPs can be radiolabeled (Syvanen, 1994)
or linked to fluorescein (Livak and Hainer, 1994). The detection of radiolabeled ddNTPs can be
achieved through scintillation-based techniques. The detection of fluorescein-linked ddNTPs can be
based on the binding of antifluorescein antibody conjugated with alkaline phosphatase, followed by
incubation with a chromogenic substrate (such as p-nitrophenyl phosphate). Other possible reporter-
detection pairs include: ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase
conjugate (Harju et al., 1993) or biotinylated ddNTP and horseradish peroxidase-conjugated
streptavidin with o-phenylenediamine as a substrate (WO 92/15712). As yet another alternative
solid-phase microsequencing procedure, Nyren et al. (1993) described a method relying on the
detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate
detection assay (ELIDA).

Pastinen et al. (1997) describe a method for multiplex detection of single nucleotide
polymorphism in which the solid phase minisequencing principle is applied to an oligonucleotide
array format. High-density arrays of DNA probes attached to a solid support (DNA chips) are
further described below.

In one aspect the present invention provides polynucleotides and methods to genotype one or
more biallelic markers of the present invention by performing a microsequencing assay. Preferred
microsequencing primers include the nucleotide sequences D1 to Dn and E1 to En. It will be
appreciated that the microsequencing primers listed in Example 5 are merely exemplary and that
any primer having a 3' end immediately adjacent to the polymorphic nucleotide may be used.
Similarly, it will be appreciated that microsequencing analysis may be performed for any biallelic
marker or any combination of biallelic markers of the present invention. One aspect of the present
invention is a solid support which includes one or more microsequencing primers listed in Example
5, or fragments comprising at least 8, 12, 15, 20, 25, 30, 40, or 50 consecutive nucleotides thereof, to
the extent that such lengths are consistent with the primer described, and having a 3' terminus
immediately upstream of the corresponding biallelic marker, for determining the identity of a
nucleotide at a biallelic marker site.

3) Mismatch detection assays based on polymerases and ligases

In one aspect the present invention provides polynucleotides and methods to determine the
allele of one or more biallelic markers of the present invention in a biological sample, by mismatch
detection assays based on polymerases and/or ligases. These assays are based on the specificity of
polymerases and ligases. Polymerization reactions place particularly stringent requirements on
correct base pairing of the 3' end of the amplification primer, and the joining of two oligonucleotides
hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site,
especially at the 3' end. Methods, primers and various parameters to amplify DNA fragments
comprising biallelic markers of the present invention are further described above in "Amplification
Of DNA Fragments Comprising Biallelic Markers".

Allele Specific Amplification Primers

Discrimination between the two alleles of a biallelic marker can also be achieved by allele
specific amplification, a selective strategy whereby one of the alleles is amplified without
amplification of the other allele. For allele specific amplification, at least one member of the pair of
primers is sufficiently complementary with a region of an olfactory receptor gene comprising the
polymorphic base of a biallelic marker of the present invention to hybridize therewith and to initiate
the amplification. Such primers are able to discriminate between the two alleles of a biallelic
marker.

This is accomplished by placing the polymorphic base at the 3' end of one of the
amplification primers. Because extension proceeds from the 3' end of the primer, a mismatch at or
near this position has an inhibitory effect on amplification. Therefore, under appropriate
amplification conditions, these primers direct amplification only of their complementary allele.
Determining the precise location of the mismatch and the corresponding assay conditions are well
within the ordinary skill in the art.
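
The placement of the polymorphic base at the 3' end of the primer can be illustrated with a short sketch. It is an idealized model in which extension occurs only when the primer matches the sample perfectly, including its 3'-terminal base; actual discrimination depends on the assay conditions, and the sequences, primer length and function names are assumptions for illustration.

```python
def allele_specific_primers(upstream: str, alleles, primer_len: int = 20) -> dict:
    """Build one primer per allele with the polymorphic base as 3'-terminal nucleotide.

    `upstream` is the sequence immediately 5' of the polymorphic base on the
    strand the primer is copied from.
    """
    if primer_len < 2 or len(upstream) < primer_len - 1:
        raise ValueError("primer length is incompatible with the upstream sequence")
    body = upstream[-(primer_len - 1):]
    return {allele: body + allele for allele in alleles}


def primes_extension(primer: str, sample_strand: str, snp_index: int) -> bool:
    """Idealized rule: extension only if the primer, 3' base included, matches the sample."""
    start = snp_index - len(primer) + 1
    return start >= 0 and sample_strand[start:snp_index + 1] == primer


# Toy sample carrying the T allele at the polymorphic position.
upstream = "ACGTTGCAAGGCTTAACCGGTACC"
sample = upstream + "T" + "GGATC"
snp_index = len(upstream)
primers = allele_specific_primers(upstream, ("C", "T"))
for allele, primer in primers.items():
    print(allele, primer, primes_extension(primer, sample, snp_index))
```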

Ligation/Amplification Based Methods

The "Oligonucleotide Ligation Assay" (OLA) uses two oligonucleotides which are designed
to be capable of hybridizing to abutting sequences of a single strand of a target molecule. One of
the oligonucleotides is biotinylated, and the other is detectably labeled. If the precise
complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that
their termini abut, and create a ligation substrate that can be captured and detected. OLA is capable
of detecting single nucleotide polymorphisms and may be advantageously combined with PCR as
described by Nickerson et al. (1990). In this method, PCR is used to achieve the exponential
amplification of target DNA, which is then detected using OLA.
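
The abutting-probe principle of OLA can be sketched as follows. The model is deliberately strict: ligation is treated as possible only when both probes are perfectly complementary to adjacent stretches of the target, whereas in practice the ligase is mainly sensitive to mismatches near the junction. The probe layout (allele-specific probe with its 3' end on the polymorphic base, common probe immediately 3' of it), the sequences and the function names are assumptions for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def ola_ligation_possible(target: str, site: int,
                          allele_probe: str, common_probe: str) -> bool:
    """True if both probes pair perfectly with abutting target segments.

    The allele-specific probe pairs with the target from `site` onward, so its
    3' end sits on the polymorphic base; the common probe pairs with the
    target stretch ending just before `site`.
    """
    allele_site = reverse_complement(allele_probe)   # target segment it would pair with
    common_site = reverse_complement(common_probe)
    common_start = site - len(common_site)
    return (common_start >= 0
            and target[site:site + len(allele_site)] == allele_site
            and target[common_start:site] == common_site)


# Toy target carrying a C at the polymorphic position (index 12).
target_c = "GGATCCTTAGCA" + "C" + "GTTACGGATCCA"
site = 12
probe_c = reverse_complement(target_c[site:site + 10])
probe_t = reverse_complement("T" + target_c[site + 1:site + 10])
common = reverse_complement(target_c[site - 10:site])
print(ola_ligation_possible(target_c, site, probe_c, common))
print(ola_ligation_possible(target_c, site, probe_t, common))
```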

Other amplification methods which are particularly suited for the detection of single
nucleotide polymorphism include LCR (ligase chain reaction) and Gap LCR (GLCR), which are
described above in "DNA Amplification". LCR uses two pairs of probes to exponentially amplify a
specific target. The sequences of each pair of oligonucleotides are selected to permit the pair to
hybridize to abutting sequences of the same strand of the target. Such hybridization forms a
substrate for a template-dependent ligase. In accordance with the present invention, LCR can be
performed with oligonucleotides having the proximal and distal sequences of the same strand of a
biallelic marker site. In one embodiment, either oligonucleotide will be designed to include the
biallelic marker site. In such an embodiment, the reaction conditions are selected such that the
oligonucleotides can be ligated together only if the target molecule either contains or lacks the
specific nucleotide that is complementary to the biallelic marker on the oligonucleotide. In an
alternative embodiment, the oligonucleotides will not include the biallelic marker, such that when
they hybridize to the target molecule, a "gap" is created as described in WO 90/01069. This gap is
then "filled" with complementary dNTPs (as mediated by DNA polymerase), or by an additional
pair of oligonucleotides. Thus at the end of each cycle, each single strand has a complement capable
of serving as a target during the next cycle and exponential allele-specific amplification of the
desired sequence is obtained.

Ligase/Polymerase-mediated Genetic Bit Analysis™ is another method for determining the 
identity of a nucleotide at a preselected site in a nucleic acid molecule (WO 95/21271). This method 
involves the incorporation of a nucleoside triphosphate that is complementary to the nucleotide 
present at the preselected site onto the terminus of a primer molecule, and its subsequent ligation 
to a second oligonucleotide. The reaction is monitored by detecting a specific label attached to the 
reaction's solid phase or by detection in solution. 
4) Hybridization Assay Methods 
A preferred method of determining the identity of the nucleotide present at a biallelic marker 
site involves nucleic acid hybridization. The hybridization probes, which can be conveniently used 
in such reactions, preferably include the probes defined herein. Any hybridization assay may be 
used including Southern hybridization, Northern hybridization, dot blot hybridization and solid- 
phase hybridization (see Sambrook et al., 1989). 

Hybridization refers to the formation of a duplex structure by two single stranded nucleic 
acids due to complementary base pairing. Hybridization can occur between exactly complementary 
nucleic acid strands or between nucleic acid strands that contain minor regions of mismatch. 
Specific probes can be designed that hybridize to one form of a biallelic marker and not to the other 
and therefore are able to discriminate between different allelic forms. Allele-specific probes are 
often used in pairs, one member of a pair showing a perfect match to a target sequence containing the 
original allele and the other showing a perfect match to the target sequence containing the alternative 
allele. Hybridization conditions should be sufficiently stringent that there is a significant difference 
in hybridization intensity between alleles, and preferably an essentially binary response, whereby a 
probe hybridizes to only one of the alleles. Stringent, sequence specific hybridization conditions, 
under which a probe will hybridize only to the exactly complementary target sequence, are well 
known in the art (Sambrook et al., 1989). Stringent conditions are sequence dependent and will be 
different in different circumstances. Generally, stringent conditions are selected to be about 5°C 
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and 
pH. Although such hybridization can be performed in solution, it is preferred to employ a solid- 
phase hybridization assay. The target DNA comprising a biallelic marker of the present invention 
may be amplified prior to the hybridization reaction. The presence of a specific allele in the sample 
is determined by detecting the presence or the absence of stable hybrid duplexes formed between the 
probe and the target DNA. The detection of hybrid duplexes can be carried out by a number of 
methods. Various detection assay formats are well known which utilize detectable labels bound to 
either the target or the probe to enable detection of the hybrid duplexes. Typically, hybridization 
duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then 
detected. Those skilled in the art will recognize that wash steps may be employed to wash away 
excess target DNA or probe as well as unbound conjugate. Further, standard heterogeneous assay 
formats are suitable for detecting the hybrids using the labels present on the primers and probes. 
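
As a rough illustration of the "about 5°C lower than Tm" guideline stated above, the following Python sketch estimates Tm with the Wallace rule, Tm = 2(A+T) + 4(G+C), and subtracts the offset. The Wallace rule is an assumption introduced here for the example (it is a common approximation for short oligonucleotides and ignores ionic strength and pH); the function names are likewise hypothetical and the disclosure does not prescribe any particular Tm formula.

```python
def wallace_tm(probe: str) -> int:
    """Approximate Tm in deg C for a short oligonucleotide: 2*(A+T) + 4*(G+C)."""
    seq = probe.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_temperature(probe: str, offset: int = 5) -> int:
    """Hybridization temperature set 'offset' deg C below the estimated Tm."""
    return wallace_tm(probe) - offset


if __name__ == "__main__":
    probe = "ACGTACGTACGTAC"  # hypothetical 14-mer, for illustration only
    print(wallace_tm(probe), stringent_temperature(probe))
```

In practice the actual conditions would be determined empirically for each probe pair, as the text notes that stringent conditions are sequence dependent.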

Two recently developed assays allow hybridization-based allele discrimination with no need 
for separations or washes (see Landegren U. et al., 1998). The TaqMan assay takes advantage of 
the 5' nuclease activity of Taq DNA polymerase to digest a DNA probe annealed specifically to the 
accumulating amplification product. TaqMan probes are labeled with a donor-acceptor dye pair that 
interacts via fluorescence energy transfer. Cleavage of the TaqMan probe by the advancing 
polymerase during amplification dissociates the donor dye from the quenching acceptor dye, greatly 
increasing the donor fluorescence. All reagents necessary to detect two allelic variants can be 
assembled at the beginning of the reaction and the results are monitored in real time (see Livak et al., 
1995). In an alternative homogeneous hybridization based procedure, molecular beacons are used 
for allele discriminations. Molecular beacons are hairpin-shaped oligonucleotide probes that report 
the presence of specific nucleic acids in homogeneous solutions. When they bind to their targets 
they undergo a conformational reorganization that restores the fluorescence of an internally 
quenched fluorophore (Tyagi et al., 1998). 

The polynucleotides provided herein can be used to produce probes which can be used in 
hybridization assays for the detection of biallelic marker alleles in biological samples. These probes 
are characterized in that they preferably comprise between 8 and 50 nucleotides, and in that they are 
sufficiently complementary to a sequence comprising a biallelic marker of the present invention to 
hybridize thereto and preferably sufficiently specific to be able to discriminate the targeted sequence 
for only one nucleotide variation. A particularly preferred probe is 25 nucleotides in length. 
Preferably the biallelic marker is within 4 nucleotides of the center of the polynucleotide probe. In 
particularly preferred probes, the biallelic marker is at the center of said polynucleotide. Preferred 
probes comprise a nucleotide sequence selected from the group consisting of amplicons listed in 
Table 1 and the sequences complementary thereto, or a fragment thereof, said fragment comprising 
at least about 8 consecutive nucleotides, preferably 10, 15, 20, more preferably 25, 30, 40, 47, or 50 
consecutive nucleotides and containing a polymorphic base. Preferred probes comprise a nucleotide 
sequence selected from the group consisting of P1 to P13 and the sequences complementary thereto. 
In preferred embodiments the polymorphic base(s) are within 5, 4, 3, 2 or 1 nucleotides of the center 
of the said polynucleotide, more preferably at the center of said polynucleotide. 
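
The geometry of such a probe (for example 25 nucleotides with the polymorphic base at the center) can be illustrated with the short Python sketch below. The helper names `centered_probe` and `offset_from_center` and the example sequence are hypothetical; the sketch assumes an odd probe length so that a single central position exists, and works on a 0-based index into an amplicon sequence.

```python
def centered_probe(sequence: str, marker_index: int, length: int = 25) -> str:
    """Return a probe of the given odd length whose central base is sequence[marker_index]."""
    if length % 2 == 0:
        raise ValueError("use an odd length so that the marker can sit exactly at the center")
    half = length // 2
    start, end = marker_index - half, marker_index + half + 1
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence around the marker")
    return sequence[start:end]


def offset_from_center(probe_length: int, marker_pos_in_probe: int) -> int:
    """Distance, in nucleotides, between the marker and the central position of the probe."""
    return abs(marker_pos_in_probe - probe_length // 2)


if __name__ == "__main__":
    # Hypothetical amplicon with the polymorphic base (G) at index 30.
    amplicon = "A" * 30 + "G" + "C" * 29
    probe = centered_probe(amplicon, 30)
    print(probe, offset_from_center(len(probe), 12))
```

A probe in which `offset_from_center` returns a value of 4 or less would correspond to the "within 4 nucleotides of the center" preference stated above.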

Preferably the probes of the present invention are labeled or immobilized on a solid support. 
Labels and solid supports are further described in "Oligonucleotide Probes and Primers". The 
probes can be non-extendable as described in "Oligonucleotide Probes and Primers". 

By assaying the hybridization to an allele specific probe, one can detect the presence or 
absence of a biallelic marker allele in a given sample. High-throughput parallel hybridization in 
array format is specifically encompassed within "hybridization assays" and is described below. 
5) Hybridization To Addressable Arrays Of Oligonucleotides 

Hybridization assays based on oligonucleotide arrays rely on the differences in hybridization 
stability of short oligonucleotides to perfectly matched and mismatched target sequence variants. 
Efficient access to polymorphism information is obtained through a basic structure comprising high- 
density arrays of oligonucleotide probes attached to a solid support (e.g., the chip) at selected 
positions. Each DNA chip can contain thousands to millions of individual synthetic DNA probes 
arranged in a grid-like pattern and miniaturized to the size of a dime. 

The chip technology has already been applied with success in numerous cases. For example, 
the screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, 
and in the protease gene of HIV-1 virus (Hacia et al., 1996; Shoemaker et al., 1996; Kozal et al., 
1996). Chips of various formats for use in detecting biallelic polymorphisms can be produced on a 
customized basis by Affymetrix (GeneChip™), Hyseq (HyChip and HyGnostics), and Protogene 
Laboratories. 

In general, these methods employ arrays of oligonucleotide probes that are complementary 
to target nucleic acid sequence segments from an individual, which target sequences include a 
polymorphic marker. EP 785280 describes a tiling strategy for the detection of single nucleotide 
polymorphisms. Briefly, arrays may generally be "tiled" for a large number of specific 
polymorphisms. By "tiling" is generally meant the synthesis of a defined set of oligonucleotide 
probes which is made up of a sequence complementary to the target sequence of interest, as well as 
preselected variations of that sequence, e.g., substitution of one or more given positions with one or 
more members of the basis set of nucleotides. Tiling strategies are further described in PCT 
application No. WO 95/11995. In a particular aspect, arrays are tiled for a number of specific, 
identified biallelic marker sequences. In particular, the array is tiled to include a number of 
detection blocks, each detection block being specific for a specific biallelic marker or a set of 
biallelic markers. For example, a detection block may be tiled to include a number of probes, which 
span the sequence segment that includes a specific polymorphism. To ensure probes that are 
complementary to each allele, the probes are synthesized in pairs differing at the biallelic marker. In 
addition to the probes differing at the polymorphic base, monosubstituted probes are also generally 
tiled within the detection block. These monosubstituted probes have bases at and up to a certain 
number of bases in either direction from the polymorphism, substituted with the remaining 
nucleotides (selected from A, T, G, C and U). Typically the probes in a tiled detection block will 
include substitutions of the sequence positions up to and including those that are 5 bases away from 
the biallelic marker. The monosubstituted probes provide internal controls for the tiled array, to 
distinguish actual hybridization from artefactual cross-hybridization. Upon completion of 
hybridization with the target sequence and washing of the array, the array is scanned to determine 
the position on the array to which the target sequence hybridizes. The hybridization data from the 
scanned array is then analyzed to identify which allele or alleles of the biallelic marker are present in 
the sample. Hybridization and scanning may be carried out as described in PCT application No. WO 
92/10092 and WO 95/11995 and US patent No. 5,424,186. 
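
The layout of one detection block, as described above, can be sketched as follows. This Python example is only a simplified illustration under stated assumptions: the function name `detection_block`, the probe length of 25 and the DNA-only alphabet are choices made for the example, and the probes are written in the sense of the input sequence rather than as the complementary strands that would actually be synthesized on a chip. It is not the tiling procedure of EP 785280 or WO 95/11995.

```python
BASES = "ACGT"


def detection_block(sequence: str, marker: int, alleles: tuple,
                    probe_len: int = 25, window: int = 5):
    """Return (allele_probes, control_probes) for one biallelic marker.

    allele_probes:  one probe per allele, differing only at the central (marker) base.
    control_probes: monosubstituted probes covering every position within 'window'
                    bases of the marker, excluding the allele probes themselves.
    """
    half = probe_len // 2
    if probe_len % 2 == 0 or window > half:
        raise ValueError("probe_len must be odd and window must fit inside the probe")
    start, end = marker - half, marker + half + 1
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence around the marker")

    # Pair of probes differing only at the biallelic marker.
    allele_probes = {
        a: sequence[start:marker] + a + sequence[marker + 1:end] for a in alleles
    }

    # Monosubstituted probes at and around the marker (internal controls).
    controls = set()
    for probe in allele_probes.values():
        for pos in range(half - window, half + window + 1):
            for base in BASES:
                if base != probe[pos]:
                    controls.add(probe[:pos] + base + probe[pos + 1:])
    controls -= set(allele_probes.values())
    return allele_probes, sorted(controls)


if __name__ == "__main__":
    # Hypothetical target segment with an A/G marker at index 20.
    probes, controls = detection_block("ACGT" * 10, 20, ("A", "G"))
    print(probes)
    print(len(controls), "monosubstituted control probes")
```

Comparing the signal on the two allele probes with the signal on the surrounding monosubstituted controls is one way to separate genuine hybridization from cross-hybridization, in the spirit of the internal controls mentioned above.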

Thus, in some embodiments, the chips may comprise an array of nucleic acid sequences of 
fragments of about 15 nucleotides in length. In further embodiments, the chip may comprise an 
array including at least one of the sequences selected from the group consisting of amplicons listed 
in Table 1 and the sequences complementary thereto, or a fragment thereof, said fragment comprising 
at least about 8 consecutive nucleotides, preferably 10, 15, 20, more preferably 25, 30, 40, 47, or 50 
consecutive nucleotides and containing a polymorphic base. In preferred embodiments the 
polymorphic base is within 5, 4, 3, 2 or 1 nucleotides of the center of the said polynucleotide, more 
preferably at the center of said polynucleotide. In some embodiments, the chip may comprise an 
array of at least 2, 3, 4, 5, 6, 7, 8 or more of these polynucleotides of the invention. Solid supports 
and polynucleotides of the present invention attached to solid supports are further described in 
"Oligonucleotide Probes And Primers". 
6) Integrated Systems 

Another technique which may be used to analyze polymorphisms includes multicomponent 
integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary 
electrophoresis reactions in a single functional device. An example of such a technique is disclosed in 
US patent 5,589,136, which describes the integration of PCR amplification and capillary 
electrophoresis in chips. 

Integrated systems can be envisaged mainly when microfluidic systems are used. These 
systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer 
included on a microchip. The movements of the samples are controlled by electric, electroosmotic 
or hydrostatic forces applied across different areas of the microchip to create functional microscopic 
valves and pumps with no moving parts. 

For genotyping biallelic markers, the microfluidic system may integrate nucleic acid 
amplification, microsequencing, capillary electrophoresis and a detection method such as laser- 
induced fluorescence detection. 

E. EXPRESSION OF AN OLF1 TO OLF10 CODING POLYNUCLEOTIDE 

Any of the coding polynucleotides of the invention may be inserted into recombinant vectors 
for expression in a recombinant host cell or a recombinant host organism. 

Thus, the present invention also encompasses a family of recombinant vectors that contain 
a coding polynucleotide from the group of coding polynucleotides of the OLF1 to OLF10 genes. 
Consequently, the present invention further deals with a recombinant vector comprising a 
polynucleotide comprising any of the coding sequences of SEQ ID No 1, preferably those selected 
from the group consisting of SEQ ID Nos 2-11. 

In a first preferred embodiment, the present invention relates to expression vectors which 
include nucleic acids encoding an olfactory receptor protein described herein under the control of an 
exogenous regulatory sequence. 

In a second preferred embodiment, a recombinant vector of the invention is used to amplify 
the inserted polynucleotide derived from an olfactory receptor genomic sequence selected from the 
group consisting of the nucleic acids of SEQ ID No 1 and of olfactory receptor cDNAs, for example 
the open reading frames of SEQ ID Nos 2-11, in a suitable cell host, this polynucleotide being 
amplified every time that the recombinant vector replicates. 

More particularly, the present invention relates to expression vectors which include nucleic 
acids encoding an olfactory receptor protein, preferably the olfactory receptor proteins of the amino 
acid sequence of SEQ ID Nos 12-21 or variants or fragments thereof, under the control of an 
exogenous regulatory sequence. 

Generally, a recombinant vector of the invention may comprise any of the polynucleotides 
described herein, including regulatory sequences, and coding sequences, as well as any olfactory 
receptor primer or probe as defined above. More particularly, the recombinant vectors of the present 
invention can comprise any of the polynucleotides described in the "Coding Regions of the olfactory 
receptor gene" section, "Genomic sequence of the olfactory receptor gene" section, the 
"Oligonucleotide Probes And Primers" section and the "Polynucleotide constructs" section. 

Some of the elements which can be found in the vectors of the present invention are 
described in further detail in the following sections. 

Vectors 

A recombinant vector according to the invention comprises, but is not limited to, a YAC 
(Yeast Artificial Chromosome), a BAC (Bacterial Artificial Chromosome), a phage, a phagemid, a 
cosmid, a plasmid or even a linear DNA molecule which may consist of chromosomal, non- 
chromosomal or synthetic DNA. Such a recombinant vector can comprise a transcriptional unit 
comprising an assembly of: 

(1) a genetic element or elements having a regulatory role in gene expression, for 
example promoters or enhancers. Enhancers are cis-acting elements of DNA, usually from about 
10 to 300 bp, that act on the promoter to increase transcription; 

(2) a structural or coding sequence which is transcribed into mRNA and eventually 
translated into a polypeptide; and 

(3) appropriate transcription initiation and termination sequences. Structural units 
intended for use in yeast or eukaryotic expression systems preferably include a leader sequence 
enabling extracellular secretion of translated protein by a host cell. Alternatively, where 
recombinant protein is expressed without a leader or transport sequence, it may include an N- 
terminal residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

Generally, recombinant expression vectors will include origins of replication, selectable 
markers permitting transformation of the host cell, and a promoter derived from a highly expressed 
gene to direct transcription of a downstream structural sequence. The heterologous structural 
sequence is assembled in appropriate phase with translation initiation and termination sequences, 
and preferably a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. 

The selectable marker genes for selection of transformed host cells are preferably 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, TRP1 for S. cerevisiae or 
tetracycline, rifampicin or ampicillin resistance in E. coli, or levan saccharase for mycobacteria. 

For facilitating the purification of the expressed protein and increasing its stability, the 
coding sequence of an olfactory receptor according to the invention can be fused at its N- or C- 
terminus with a protein such as MBP (maltose binding protein) or GST (Glutathione S transferase) 
or with a tag such as a poly-histidine tag, Strep tag, Bio tag, or flag peptide epitope tag, these being 
detailed below. Thioredoxin can optionally be inserted between the olfactory receptor and the tag. 

Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired polypeptide with suitable translation initiation and termination signals 
in operable reading phase with a functional promoter. The vector will comprise one or more 
phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and 
to, if desirable, provide amplification within the host. 

As a representative but non-limiting example, useful expression vectors for bacterial use can 
comprise a selectable marker and bacterial origin of replication derived from commercially available 
plasmids comprising genetic elements of pBR322 (ATCC 37017). Such commercial vectors include, 
for example, pKK223-3 (Pharmacia, Uppsala, Sweden), and GEM1 (Promega Biotec, Madison, WI, 
USA). 

Large numbers of suitable vectors and promoters are known to those of skill in the art, and 
commercially available, such as bacterial vectors: pQE70, pQE60, pQE-9 (Qiagen), pbs, pD10, 
phagescript, psiX174, pbluescript SK, pbsks, pNH8A, pNH16A, pNH18A, pNH46A (Stratagene); 
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); or eukaryotic vectors: pWLNEO, 
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); 
baculovirus transfer vector pVL1392/1393 (Pharmingen); pQE-30 (QIAexpress). 

A suitable vector for the expression of the olfactory receptors defined above or their peptide 
fragments is a baculovirus vector that can be propagated in insect cells and in insect cell lines. A 
specific suitable host vector system is the pVL1392/1393 baculovirus transfer vector (Pharmingen) 
that is used to transfect the SF9 cell line (ATCC N° CRL 1711) which is derived from Spodoptera 
frugiperda. 

Other suitable vectors for the expression of an olfactory receptor or their peptide fragments 
or variants in a baculovirus expression system include those described by Chai et al. (1993), Vlasak 
et al. (1983) and Lenhard et al. (1996). 

Mammalian expression vectors will comprise an origin of replication, a suitable promoter 
and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and 
acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. 
DNA sequences derived from the SV40 viral genome, for example SV40 origin, early promoter, 
enhancer, splice and polyadenylation sites may be used to provide the required nontranscribed 
genetic elements. 

Promoters 

The suitable promoter regions used in the expression vectors according to the present 
invention are chosen taking into account the cell host in which the heterologous gene is to be 
expressed. 

A suitable promoter may be heterologous with respect to the nucleic acid for which it 
controls the expression or alternatively can be endogenous to the native polynucleotide containing 
the coding sequence to be expressed. Additionally, the promoter is generally heterologous with 
respect to the recombinant vector sequences within which the construct promoter/coding sequence 
has been inserted. 

Thus, the promoter is selected from the group comprising: 

- an internal or an endogenous promoter, such as the natural promoter associated 
with the structural gene coding for the desired olfactory receptor polypeptide or the fragment or 
variant thereof; such a promoter may be completed by a regulatory element derived from the 
vertebrate host, in particular an activator element; 

- a promoter derived from a cytoskeletal protein gene such as the desmin promoter 
(Bolmont et al., 1990; Zhenlin et al., 1989) or a promoter derived from a gene specifically expressed 
in epithelial cells and most preferably in olfactory epithelial cells. 

Preferred bacterial promoters are the LacI, LacZ, the T3 or T7 bacteriophage RNA 
polymerase promoters, the polyhedrin promoter, or the p10 protein promoter from baculovirus (Kit 
Novagen) (Smith et al., 1983; O'Reilly et al., 1992), the lambda PR promoter or also the trc 
promoter. 

Promoter regions can be selected from any desired gene using, for example, CAT 
(chloramphenicol transferase) vectors and more preferably pKK232-8 and pCM7 vectors. 
Particularly preferred bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. 
Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, 
LTRs from retrovirus, and mouse metallothionein-I. Selection of a convenient vector and promoter 
is well within the level of ordinary skill in the art. 

The choice of a particular promoter among the above-described promoters is well within the 
ability of one skilled in the art, guided by his knowledge in the genetic engineering technical field, 
and also by the book of Sambrook et al. (1989) or by the procedures described by 
Fuller et al. (Fuller S.A. et al., 1996). 

A preferred constitutive promoter is one of the internal promoters that are active 
in resting fibroblasts, such as the promoter of the phosphoglycerate kinase gene (PGK-1). The PGK- 
1 promoter is either the mouse promoter or the human promoter such as described by Adra et al. 
(1987). Other constitutive promoters may also be used, such as the beta-actin promoter (Kort et al., 
1983) or the vimentin promoter (Rettlez and Basenga, 1987). 

The vector containing the appropriate DNA sequence as described above, more preferably an 
OLF1 to OLF10 coding polynucleotide, can be utilized to transform an appropriate host to allow the 
expression of the desired polypeptide or polynucleotide. 

Other types of vectors 

The in vivo expression of an olfactory receptor polypeptide encompassed by the invention or 
a fragment or a variant thereof may be useful in order to correct a genetic defect related to the 
expression of the native gene in a host organism or to the production of biologically active olfactory 
receptor proteins. 

Consequently, the present invention also deals with recombinant expression vectors mainly 
designed for the in vivo production of a therapeutic peptide fragment by the introduction of the 
genetic information in the organism of the patient to be treated. This genetic information may be 
introduced in vitro in a cell that has been previously extracted from the organism, the modified cell 
being subsequently reintroduced in the said organism, or directly in vivo into the appropriate tissue, 
preferably the olfactory epithelium. 

One specific embodiment for a method for delivering the corresponding protein or peptide to 
the interior of a cell of a vertebrate in vivo comprises the step of introducing a preparation 
comprising a physiologically acceptable carrier and a naked polynucleotide operatively coding for 
the polypeptide into the interstitial space of a tissue comprising the cell, whereby the naked 
polynucleotide is taken up into the interior of the cell and has a physiological effect. 

In a specific embodiment, the invention provides a composition for the in vivo production of 
an olfactory receptor polypeptide described herein containing a naked polynucleotide operatively 
coding for an olfactory receptor selected from the group of OLF1 to OLF10 or a fragment or a 
variant thereof, in solution in a physiologically acceptable carrier and suitable for introduction into a 
tissue to cause cells of the tissue to express the said protein or polypeptide. 

Advantageously, the composition described above is administered locally, near the site in 
which the expression of the olfactory receptor polypeptide under consideration or a fragment or a 
variant thereof is sought. 

The polynucleotide operatively coding for an olfactory receptor polypeptide or a fragment or 
variant thereof may be a vector comprising the genomic DNA or the complementary DNA (cDNA) 
coding for the corresponding protein and a promoter sequence allowing the expression of the 
genomic DNA or the complementary DNA in the desired eukaryotic cells, such as vertebrate cells, 
specifically mammalian cells. 

This vector may also contain one origin of replication that allows it to replicate in the 
eukaryotic host cell such as an origin of replication from a bovine papillomavirus. Alternatively, the 
vector can contain several, for example two, origins of replication of different origins in order to 
allow said vector to replicate in different host cells, typically both in a prokaryotic cell such as E. 
coli and in a eukaryotic cell such as a mammalian epithelial cell, preferably a mammalian olfactory 
epithelial cell. 

Compositions comprising a polynucleotide are described in the PCT application N° WO 
90/11092 (Vical Inc.) and also in the PCT application N° WO 95/11307 (Institut Pasteur, INSERM, 
Universite d'Ottawa) as well as in the articles of Tacson et al. (1996) and of Huygen et al. (1996). 

In another embodiment, the DNA to be introduced is complexed with DEAE-dextran 
(Pagano et al., 1967) or with nuclear proteins (Kaneda et al., 1989), with lipids (Felgner et al., 1987) 
or encapsulated within liposomes (Fraley et al., 1980). 

In another embodiment, the polynucleotide encoding an olfactory receptor polypeptide of 
the invention or a fragment or a variant thereof may be included in a transfection system comprising 
polypeptides that promote its penetration within the host cells as it is described in the PCT 
application WO 95/10534 (Seikagaku Corporation). They can also be encapsulated in polymer 
microparticles as it is described in the PCT Application No WO 94/27238. 

The vector according to the present invention may advantageously be administered in the 
form of a gel that facilitates its transfection into the cells. Such a gel composition may be a 
complex of poly-L-lysine and lactose, as described by Midoux (1993) or also poloxamer 407 as 
described by Pastore (1994). Said vector may also be suspended in a buffer solution or be associated 
with liposomes. 

The amount of the vector to be injected into the desired host organism varies according to the 
site of injection. As an indicative dose, between 0.1 and 100 µg of the vector will be injected into an 
animal body, preferably a mammal body, for example a mouse body. 

In another embodiment of the vector according to the invention, said vector may be 
introduced in vitro in a host cell, preferably in a host cell previously harvested from the animal to be 
treated and more preferably a somatic cell such as a muscle cell. In a subsequent step, the cell that 
has been transformed with the vector coding for the desired olfactory receptor polypeptide or the 
desired fragment or variant thereof is implanted back into the animal body in order to deliver the 
recombinant protein within the body either locally or systemically. 

Suitable vectors for the in vivo expression of an olfactory receptor polypeptide of the 
invention or a fragment or a variant thereof are described hereunder. 

In one specific embodiment, the vector is derived from an adenovirus. Preferred 
adenovirus vectors according to the invention are those described by Feldman and Steg (1996) or 
Ohno et al. (1994). Another preferred recombinant adenovirus according to this specific embodiment 
of the present invention is the adenovirus described by Ohwada et al. (1996) or the human 
adenovirus type 2 or 5 (Ad 2 or Ad 5) or an adenovirus of animal origin (French patent application 
FR-93.05954). 

Among the adenoviruses of animal origin, there may be cited the adenoviruses of canine (CAV2, 
strain Manhattan or A26/61 [ATCC VR-800]), bovine, murine (Mav1, Beard et al., 1980) or simian 
(SAV) origin. 

Preferably, the inventors are using recombinant defective adenoviruses that may be prepared 
following a technique well known to one skilled in the art, for example as described by Levrero et al. 
(1991) or by Graham (1984) or in the European patent application N° EP-185.573. Another 
defective recombinant adenovirus that may be used according to the present invention, as well as a 
composition of matter containing such a defective recombinant adenovirus, is described in the PCT 
application N° WO 95/14785. 

Retrovirus vectors and adeno-associated virus vectors are generally understood to be the 
recombinant gene delivery system of choice for the transfer of exogenous polynucleotides in vivo, 
particularly to mammals, including humans. These vectors provide efficient delivery of genes into 
cells, and the transferred nucleic acids are stably integrated into the chromosomal DNA of the host. 

The use of recombinant retrovirus vectors containing a nucleic acid according to the 
invention is also encompassed within the scope of the invention. A major prerequisite for the use of 
retroviruses is to ensure the safety of their use, particularly with regard to the possibility of the 
spread of wild-type virus in the cell population. The development of specialized cell lines (termed 
"packaging cells") which produce only replication defective retroviruses has increased the utility of 
retroviruses for in vivo gene delivery, and defective retroviruses are well characterized for use in 
gene transfer. Thus, recombinant retroviruses can be constructed in which a part of the retroviral 
coding sequence (gag, pol, env) has been replaced by nucleic acid encoding an olfactory receptor, 
rendering the retrovirus defective. Protocols for producing recombinant retroviruses and for 
infecting cells in vitro and in vivo with such viruses can be found in "Current Protocols in Molecular 
Biology" (1989). 

Furthermore, it has been shown that it is possible to limit the infection spectrum of
retroviruses, and consequently of retroviral-based vectors, by modifying the viral packaging proteins
on the surface of the viral particle, as described for example in the PCT Application No WO
93/25234 or in the PCT Application No WO 94/06920. For instance, strategies for the modification
of the infection spectrum of retroviral vectors include: coupling antibodies specific for cell surface
antigens to the viral env protein (Julan et al., 1992) or coupling cell surface receptor ligands to the
viral env protein (Neda et al., 1991). Coupling can take the form of chemical cross-linking with
a protein or other variety (e.g. lactose to convert the env protein to an asialoglycoprotein), as well as
of generating fusion proteins (e.g. single-chain antibody/env fusion proteins). This technique, while
useful to limit or otherwise direct the infection to certain tissue types, can also be used to convert an
ecotropic vector into an amphotropic vector.

Particularly preferred retroviruses for the preparation or construction of retroviral in vitro or
in vivo gene delivery vehicles of the present invention include retroviruses selected from the group
consisting of Mink-Cell Focus Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis virus
and Rous Sarcoma virus. Particularly preferred Murine Leukemia Viruses include 4070A and 1504A
(Hartley et al., 1976), Abelson (ATCC No VR-999), Friend (ATCC No VR-245), Gross (ATCC No
VR-590), Rauscher (ATCC No VR-998) and Moloney Murine Leukemia Virus (ATCC No VR-190;
PCT Application No WO 94/24298). Particularly preferred Rous Sarcoma Viruses include Bryan
high titer (ATCC Nos VR-334, VR-657, VR-726, VR-659 and VR-728). Another preferred
retroviral vector is that described by Roth et al. (Roth J.A. et al., 1996).

Yet another viral vector system that is contemplated by the invention consists of the
adeno-associated virus (AAV). Adeno-associated virus is a naturally occurring defective virus that
requires another virus, such as an adenovirus or a herpes virus, as a helper virus for efficient
replication and a productive life cycle (Muzyczka et al., 1992). It is also one of the few viruses that
may integrate its DNA into non-dividing cells, and exhibits a high frequency of stable integration
(Flotte et al., 1992; Samulski et al., 1989; McLaughlin et al., 1989). One advantageous feature of AAV
derives from its reduced efficacy for transducing primary cells relative to transformed cells.

Cell hosts

Another object of the invention consists of host cells that have been transformed or
transfected with one of the polynucleotides described herein, and more precisely a polynucleotide
comprising the coding sequence of any of the olfactory receptor polypeptides having the amino acid
sequence of SEQ ID Nos 12-21 or fragments or variants thereof. Also included are host cells that are
transformed (prokaryotic cells) or that are transfected (eukaryotic cells) with a recombinant vector
such as one of those described above.

A recombinant host cell of the invention comprises any one of the polynucleotides or the
recombinant vectors described herein. More particularly, the cell hosts of the present invention can
comprise any of the polynucleotides described in the "Coding regions of the olfactory receptor gene"
section, the "Genomic sequence of olfactory receptor gene" section, the "Oligonucleotide Probes And
Primers" section, the "Polynucleotide constructs" section and the "Expression of an OLF1 to
OLF10 coding polypeptide" section.

Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, as well as
various species within the genera of Streptomyces or Mycobacterium. Suitable eukaryotic hosts
comprise yeast and insect cells, such as Drosophila and Sf9. Various mammalian cell hosts can also be
employed to express recombinant protein. Examples of mammalian cell hosts include the COS-7
lines of monkey kidney fibroblasts (Guzman, 1981), and other cell lines capable of expressing a
compatible vector, for example the C127, 3T3, CHO, HeLa and BHK cell lines. The selection of a
host is within the competence of one skilled in the art.

Preferred cell hosts used as recipients for the expression vectors of the invention are the
following:

a) Prokaryotic host cells: Escherichia coli strains (i.e. DH5-α strain) or Bacillus subtilis.

b) Eukaryotic host cells: HeLa cells (ATCC N°CCL2; N°CCL2.1; N°CCL2.2), Cv 1 cells
(ATCC N°CCL70), COS cells (ATCC N°CRL1650; N°CRL1651), Sf-9 cells (ATCC N°CRL1711).

The constructs in the host cells can be used in a conventional manner to produce the gene
product encoded by the recombinant sequence.

Following transformation of a suitable host and growth of the host to an appropriate cell
density, the selected promoter is induced by appropriate means, such as temperature shift or
chemical induction, and cells are cultivated for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and
the resulting crude extract retained for further purification.

Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents. Such methods are well known to the skilled artisan.

Transgenic animals 

The terms "transgenic animals" or "host animals" are used herein to designate animals that
have their genome genetically and artificially manipulated so as to include one of the nucleic acids
according to the invention. Preferred animals are non-human mammals and include those belonging
to a genus selected from Mus (e.g. mice), Rattus (e.g. rats) and Oryctolagus (e.g. rabbits) which have
their genome artificially and genetically altered by the insertion of a nucleic acid according to the
invention.

The transgenic animals of the invention all include within a plurality of their cells a cloned
recombinant or synthetic DNA sequence, more specifically one of the purified or isolated nucleic
acids comprising an olfactory receptor coding sequence selected from the group OLF1 to OLF10, an
olfactory receptor regulatory polynucleotide or a DNA sequence encoding an antisense
polynucleotide such as described in the present specification.

More particularly, transgenic animals according to the invention contain in their somatic
cells and/or in their germ line cells any of the polynucleotides described in the "Coding regions of
the olfactory receptor gene" section, the "Genomic sequence of olfactory receptor gene" section, the
"Oligonucleotide Probes And Primers" section, the "Polynucleotide constructs" section and the
"Expression of an OLF1 to OLF10 coding polypeptide" section.

The replacement of the native genomic olfactory receptor sequence by a defective copy of
said sequence may be performed by techniques of gene targeting. Such techniques are notably
described by Burright et al. (1997), Bates et al. (1997), Mangiarini et al. (1997), Davies et al. (1997).

Second preferred transgenic animals of the invention have the murine olfactory receptor
gene replaced either by a defective copy of the murine olfactory receptor gene or by an interrupted
copy of the human olfactory receptor gene. A "defective copy" of a murine or a human olfactory
receptor gene is intended to designate a modified copy of these genes that is not transcribed, or is
poorly transcribed, in the resulting recombinant host animal, or a modified copy of these genes leading
to the absence of synthesis of the corresponding translation product or alternatively leading to a
modified and/or truncated translation product lacking the biological activity of the wild type olfactory
receptor protein. The altered translation product thus contains amino acid modifications, deletions and
substitutions. Modifications and deletions may render the naturally occurring gene nonfunctional,
thus leading to a "knockout animal". These transgenic animals are critical for the creation of animal
models of human diseases, and for eventual treatment of disorders related to alteration of the
olfactory perception of odorant substances or molecules. Examples of such knockout mice are
described in the PCT Applications Nos WO 97/34641, WO 96/12792 and WO 98/02354.

The endogenous murine olfactory receptor gene can be interrupted by the insertion, between
two contiguous nucleotides of said gene, of part or all of a marker gene placed under the control of
the appropriate promoter, for example the endogenous promoter of the endogenous murine olfactory
receptor gene. The marker gene may be the neomycin resistance gene (neo) that may be operably
linked to the phosphoglycerate kinase-1 (PGK-1) promoter, as described in the PCT Application No
WO 98/02534.

Thus, the invention is also directed to transgenic animals that contain in their somatic cells
and/or in their germ line cells a polynucleotide selected from the following group of
polynucleotides:

a) a defective copy of the human olfactory receptor gene;

b) a defective copy of the endogenous olfactory receptor gene, wherein the expression
"endogenous olfactory receptor gene" designates an olfactory receptor gene that is naturally present
within the genome of the animal host to be genetically modified.

The invention also concerns a method for obtaining transgenic animals, wherein said
method comprises the steps of:

a) replacing the endogenous copy of the animal olfactory receptor gene by a nucleic acid
selected from the group consisting of a defective copy of the human olfactory receptor gene and a
defective copy of the endogenous olfactory receptor gene in animal cells, preferably embryonic stem
cells (ES);

b) introducing the recombinant animal cells obtained at step a) in embryos, notably
blastocysts of the animal;

c) selecting the resulting transgenic animals, for example by detecting the defective copy of
an olfactory receptor gene with one or several primers or probes according to the invention.

Optionally, the transgenic animals may be bred together in order to obtain transgenic
animals that are homozygous for the defective copy of the olfactory receptor gene introduced.
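
The breeding step can be illustrated with a short worked example. The sketch below is not part of the described method: it assumes a single autosomal locus, simple Mendelian segregation and no effect of the defective copy on viability, and the allele symbols "+" (wild-type copy) and "-" (defective copy) are hypothetical labels chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, Fraction]:
    """Return expected genotype fractions for a one-locus cross.

    Each parent is a two-character string of alleles, for example "+-" for an
    animal carrying one wild-type (+) and one defective (-) copy.
    """
    # Enumerate every pairing of one allele from each parent (Punnett square).
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Crossing two heterozygous founders (assumed genotypes).
offspring = cross("+-", "+-")
for genotype, fraction in sorted(offspring.items()):
    print(genotype, fraction)
```

Under these assumptions, a cross between two heterozygous animals is expected to yield one quarter of offspring homozygous for the defective copy, which is why genotyping of the progeny with primers or probes remains necessary.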

The transgenic animals of the invention thus contain specific sequences of exogenous
genetic material such as the nucleotide sequences described above in detail.

In a preferred embodiment, these transgenic animals may be good experimental models in
order to study the diverse pathologies related to disorders associated with alteration of the olfactory
perception of odorant substances or molecules, in particular the transgenic animals
within the genome of which has been inserted one or several copies of a polynucleotide encoding a
native olfactory receptor protein, or alternatively a mutant olfactory receptor protein.

Third preferred transgenic animals according to the invention contain in their somatic cells
and/or in their germ line cells a polynucleotide selected from the following group of polynucleotides:

a) a purified or isolated nucleic acid encoding an olfactory receptor polypeptide selected
from OLF1 to OLF10, or a polypeptide fragment or variant thereof;

b) a purified or isolated nucleic acid comprising at least 8 consecutive nucleotides of the
nucleotide sequence SEQ ID No 1, a nucleotide sequence complementary thereto or a fragment
or a variant thereof;

c) a purified or isolated nucleic acid comprising a nucleotide sequence selected from the
group of SEQ ID Nos 2-11, a sequence complementary thereto or a fragment or a variant thereof.

The transgenic animals of the invention thus contain specific sequences of exogenous
genetic material such as the nucleotide sequences described above in detail.

In a first preferred embodiment, these transgenic animals may be good experimental models 
in order to screen the candidate substance of interest interacting with the olfactory receptor under 
consideration. 

Since it is possible to produce transgenic animals of the invention using a variety of different
sequences, a general description will be given of the production of transgenic animals by referring
generally to exogenous genetic material. This general description can be adapted by those skilled in
the art in order to incorporate the DNA sequences into animals. For more details regarding the
production of transgenic animals, and specifically transgenic mice, reference may be made to Sandou
et al. (1994) and also to US Patents Nos 4,873,191, issued Oct. 10, 1989, 5,968,766, issued Dec. 16,
1997 and 5,387,742, issued Feb. 28, 1995.

Transgenic animals of the present invention are produced by the application of procedures
which result in an animal with a genome that incorporates exogenous genetic material which is
integrated into the genome. The procedure involves obtaining the genetic material, or a portion
thereof, which encodes either a coding sequence, a non-coding polynucleotide or a DNA sequence
encoding an antisense polynucleotide of an olfactory receptor selected from the group OLF1 to
OLF10 such as described in the present specification.

A recombinant polynucleotide of the invention is inserted into an embryonic stem (ES) cell
line. The insertion is made using electroporation. The cells subjected to electroporation are screened
(e.g. Southern blot analysis) to find positive cells which have integrated the exogenous recombinant
polynucleotide into their genome. An illustrative positive-negative selection procedure that may be
used according to the invention is described by Mansour et al. (1988). Then, the positive cells are
isolated, cloned and injected into 3.5-day-old blastocysts from mice. The blastocysts are then
inserted into a female host animal and allowed to grow to term. The offspring of the female host
are tested to determine which animals are transgenic, i.e. include the inserted exogenous DNA
sequence, and which are wild-type.

Thus, the present invention also concerns a transgenic animal containing a nucleic acid, a
recombinant expression vector or a recombinant host cell according to the invention.

Recombinant Cell Lines Derived From The Transgenic Animals Of The Invention. 

A further object of the invention comprises recombinant host cells obtained from a
transgenic animal described herein. In one embodiment the invention encompasses cells derived
from non-human host mammals and animals comprising a recombinant vector of the invention or an
olfactory receptor gene disrupted by homologous recombination with a knock out vector.

Recombinant cell lines may be established in vitro from cells obtained from any tissue of a
transgenic animal according to the invention, for example by transfection of primary cell cultures
with vectors expressing onc-genes such as SV40 large T antigen, as described by Chou (1989) and
Shay et al. (1991).

F. METHODS FOR SCREENING SUBSTANCES OR MOLECULES
INTERACTING WITH AN OLFACTORY RECEPTOR PROTEIN

The present invention pertains to methods for screening substances of interest, in particular
odorant substances or molecules that interact with an olfactory receptor protein selected from the
group consisting of OLF1 to OLF10, or one peptide fragment or variant thereof. In one embodiment,
the candidate substance is devoid of odorant properties but is able to bind the olfactory receptor and
to trigger the transduction of signals.

For the purpose of the present invention, a ligand means a molecule, such as a protein, a
peptide, an antibody or any synthetic chemical compound, capable of binding to the olfactory
receptor protein or one of its fragments or variants, or of modulating the expression of the
polynucleotide coding for the olfactory receptor or a fragment or variant thereof.

In the ligand screening method according to the present invention, a biological sample or a
defined molecule to be tested as a putative ligand of the olfactory receptor protein is brought into
contact with the corresponding purified olfactory receptor protein, for example the corresponding
purified recombinant olfactory receptor protein produced by a recombinant cell host as described
herein, in order to form a complex between this protein and the putative ligand molecule to be tested.

As an illustrative example, to study the interaction of the olfactory receptor protein, or a
fragment comprising any of the fragments described in the section "OLF1 to OLF10
proteins and polypeptide fragments", with drugs or small molecules, such as molecules generated
through combinatorial chemistry approaches, the microdialysis coupled to HPLC method described
by Wang et al. (1997) or the affinity capillary electrophoresis method described by Bush et al.
(1997) can be used.

In further methods, peptides, drugs, fatty acids, lipoproteins, or small molecules which
interact with the olfactory receptor protein, or a fragment comprising any of the fragments described
in the section "OLF1 to OLF10 proteins and polypeptide fragments", may be identified using assays
such as the following. The molecule to be tested for binding is labeled with a detectable label, such
as a fluorescent, radioactive, or enzymatic tag, and placed in contact with immobilized olfactory
receptor protein, or a fragment thereof, under conditions which permit specific binding to occur, such
as affinity columns. In some embodiments, chimeric proteins containing the olfactory receptor
protein fused to proteins facilitating purification, such as glutathione S-transferase (GST), are used.
After removal of non-specifically bound molecules, bound molecules are detected using appropriate
means.

In one embodiment, proteins, peptides, carbohydrates, lipids, or small molecules generated
by combinatorial chemistry interacting with the olfactory receptor protein, or a fragment or a variant
thereof, can also be screened by using an Optical Biosensor as described in Edwards and
Leatherbarrow (1997) and also in Szabo et al. (1995). The main advantage of the method is that it
allows the determination of the association rate between the olfactory receptor protein and molecules
interacting with the olfactory receptor protein. It is thus possible to select specifically ligand
molecules interacting with the olfactory receptor protein, or a fragment thereof, through strong or
conversely weak association constants.
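
As a rough illustration of how rate measurements translate into association constants, the sketch below assumes a simple reversible 1:1 interaction, for which the equilibrium association constant is the ratio of the association rate constant to the dissociation rate constant. The function names, candidate names and rate values are hypothetical and are not taken from the cited biosensor references.

```python
import math


def equilibrium_constants(k_on: float, k_off: float) -> tuple[float, float]:
    """Return (K_A, K_D) for an assumed 1:1 interaction.

    k_on  : association rate constant, in M^-1 s^-1
    k_off : dissociation rate constant, in s^-1
    K_A is returned in M^-1 and K_D in M.
    """
    if not (math.isfinite(k_on) and math.isfinite(k_off)) or k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive finite numbers")
    return k_on / k_off, k_off / k_on


def rank_by_affinity(candidates: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Sort candidates from strongest to weakest binder (smallest K_D first)."""
    ranked = [(name, equilibrium_constants(k_on, k_off)[1])
              for name, (k_on, k_off) in candidates.items()]
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical rate constants for two candidate ligands.
measured = {
    "candidate_A": (1e5, 1e-3),
    "candidate_B": (1e4, 1e-1),
}
for name, k_d in rank_by_affinity(measured):
    print(f"{name}: K_D = {k_d:.2e} M")
```

Selecting for strong or weak binders then amounts to applying a threshold to the ranked list; the threshold itself would depend on the intended use of the ligand.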

Another object of the present invention comprises methods and kits for the screening of
candidate substances that interact with an olfactory receptor polypeptide.

The present invention pertains to methods for screening substances of interest that interact
with an olfactory receptor protein or one fragment or variant thereof. By their capacity to bind
covalently or non-covalently to an olfactory receptor protein or to a fragment or variant thereof,
these substances or molecules may be advantageously used both in vitro and in vivo. In vitro, said
interacting molecules may be used as detection means in order to identify the presence of an
olfactory receptor protein in a sample, preferably a biological sample.

A first method for the screening of a candidate substance interacting with an olfactory
receptor polypeptide selected from the group consisting of SEQ ID Nos 12-21, or fragments or
variants thereof, comprises the following steps:

a) providing a polypeptide selected from the group consisting of the polypeptides
comprising, consisting essentially of, or consisting of the amino acid sequences of SEQ ID
Nos 12-21, or a peptide fragment or a variant thereof;

b) obtaining a candidate substance;

c) bringing into contact said polypeptide with said candidate substance; and

d) detecting the complexes formed between said polypeptide and said candidate
substance.

Various candidate substances or molecules can be assayed for interaction with an olfactory
receptor polypeptide. These substances or molecules include, without being limited to, natural or
synthetic organic compounds or molecules of biological origin such as polypeptides. When the
candidate substance or molecule comprises a polypeptide, this polypeptide may be the resulting
expression product of either a phage clone belonging to a phage-based random peptide library, or of
a cDNA library cloned in a vector suitable for performing a two-hybrid screening assay.

In one embodiment of the screening method defined above, the complexes formed between
the polypeptide and the candidate substance are further incubated in the presence of a polyclonal or a
monoclonal antibody that specifically binds to the olfactory receptor protein of the invention under
consideration or to said peptide fragment or variant thereof.

In another embodiment of the present screening method, increasing concentrations of a
substance competing for binding to the olfactory receptor with the considered candidate substance are
added, simultaneously or prior to the addition of the candidate substance or molecule, when
performing step c) of said method. By this technique, the detection and optionally the quantification
of the complexes formed between the olfactory receptor protein or the peptide fragment or variant
thereof and the candidate substance or molecule to be screened allows one skilled in the art to
determine the affinity value of said substance or molecule for said olfactory receptor protein or the
peptide fragment or variant thereof.
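
One possible way to turn such competition data into an affinity value is sketched below. It is an assumption-laden illustration rather than the procedure of the specification: it assumes a single binding site, estimates the competitor concentration giving 50 % displacement (IC50) by interpolation on a logarithmic concentration axis, and then applies the Cheng-Prusoff relation, a standard pharmacological formula that is not mentioned in the source text. All concentrations and function names are hypothetical.

```python
import math


def estimate_ic50(concentrations: list[float], bound_fraction: list[float]) -> float:
    """Estimate IC50 by log-linear interpolation between bracketing points.

    concentrations : competitor concentrations (M), all strictly positive
    bound_fraction : complex remaining, as a fraction of the uncompeted signal
    """
    points = sorted(zip(concentrations, bound_fraction))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 0.5 >= b_hi and b_lo != b_hi:
            t = (b_lo - 0.5) / (b_lo - b_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50 % binding")


def cheng_prusoff(ic50: float, probe_conc: float, probe_kd: float) -> float:
    """Convert IC50 into an inhibition constant K_i (Cheng-Prusoff relation)."""
    return ic50 / (1.0 + probe_conc / probe_kd)


# Synthetic data generated from a one-site model with IC50 = 1e-6 M.
conc = [1e-8, 1e-7, 1e-6, 1e-5, 1e-4]
bound = [1.0 / (1.0 + c / 1e-6) for c in conc]

ic50 = estimate_ic50(conc, bound)
k_i = cheng_prusoff(ic50, probe_conc=1e-7, probe_kd=1e-7)
print(f"IC50 = {ic50:.2e} M, K_i = {k_i:.2e} M")
```

A full curve fit would normally be preferred over interpolation when enough data points are available; the interpolation is used here only to keep the example dependency-free.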

The olfactory receptor selected from the group consisting of OLF1 to OLF10, or a peptide
fragment or a variant thereof, can be overexpressed and purified in a bacterial system such as E. coli
as described in Kiefer et al. (1996) and Tucker et al. (1996). The olfactory receptor coding sequence
can be fused at its N-terminus with GST (Glutathione S-transferase) or MBP (Maltose Binding
Protein) and at its C-terminus with a poly-histidine tag, Bio tag or Strep tag to facilitate the
purification of the expressed protein. The Bio tag is 13 amino acid residues long, is biotinylated in
vivo in E. coli, and will therefore bind to both avidin and streptavidin. The Strep tag is 9 amino acid
residues long and binds specifically to streptavidin, but not to avidin. An affinity purification step
can therefore be carried out based on the interaction of the poly-histidine tail with immobilized metal
ions, of the biotinylated Bio tag with monomeric avidin, of the Strep tag with streptavidin, of the
GST segment with glutathione, or of the MBP segment with maltose. Thioredoxin can
optionally be inserted between the receptor C-terminus and the tag and could increase the expression
level. The fusion protein is solubilized in 1 % N-lauroyl sarcosine, and 0.2 % digitonin is added. It is
purified by affinity chromatography. The MBP, GST or tag segment can then be removed. After the
olfactory receptor protein purification, sarcosyl can be replaced with digitonin, which is a detergent
widely used to stabilize G protein-coupled receptors. The purified receptor is reconstituted into
lipid vesicles, preferably composed of phosphatidylcholine:phosphatidylglycerol (4:1), by adding the
lipid dissolved in dodecyl maltoside and removing the detergent.

The olfactory receptor selected from the group consisting of OLF1 to OLF10, or a peptide
fragment or a variant thereof, can also be overexpressed and purified in a baculovirus/Sf9 system as
described in Nekrasova et al. (1996). The olfactory receptor gene, or a fragment thereof, is
preferably expressed with a "flag" peptide epitope tag and/or a poly-histidine tag at either its N- or
C-terminus to facilitate the purification of the expressed protein. Therefore, the olfactory receptor
gene, or a fragment or a variant thereof, is preferably subcloned into the baculovirus transfer vector
pAcSGHisNT to create constructs that encode the olfactory receptor with an amino-terminal
poly-histidine tag. The resulting transfer vector is transfected, preferably with BaculoGold DNA, into
Sf9 cells. The expressed olfactory receptors are then solubilized either in 1 % N-lauryl sarcosine or
1.5 % lysophosphatidylcholine, but preferably in lysophosphatidylcholine. After solubilization, the
olfactory receptors are purified by affinity chromatography on nickel nitrilotriacetic acid resin and
by cation-exchange chromatography with a carboxymethyl sepharose cation-exchange column. The
tag segment can then be removed. The purified receptor is reconstituted into lipid vesicles preferably
composed of dimyristoylglycerophosphocholine, cholesterol, dipalmitoylglycerophosphoserine and
dipalmitoylglycerophosphoethanolamine (in molar ratio 54:35:10:1).
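
The 54:35:10:1 ratio can be converted into per-component amounts with simple arithmetic. The sketch below assumes that the ratio is a molar ratio and that a total lipid amount has been chosen by the experimenter; the total of 10 µmol used in the example and the function name are hypothetical.

```python
# Lipid composition of the reconstitution vesicles, as parts of the stated ratio.
LIPID_RATIO = {
    "dimyristoylglycerophosphocholine": 54,
    "cholesterol": 35,
    "dipalmitoylglycerophosphoserine": 10,
    "dipalmitoylglycerophosphoethanolamine": 1,
}


def lipid_amounts(total_umol: float) -> dict[str, float]:
    """Split a total lipid amount (in micromoles) according to LIPID_RATIO."""
    if total_umol <= 0:
        raise ValueError("total lipid amount must be positive")
    total_parts = sum(LIPID_RATIO.values())
    return {name: total_umol * parts / total_parts
            for name, parts in LIPID_RATIO.items()}


# Hypothetical preparation using 10 micromoles of total lipid.
for name, amount in lipid_amounts(10.0).items():
    print(f"{name}: {amount:.2f} umol")
```

Converting these amounts to masses would additionally require the molar mass of each lipid species actually used, which is not specified here.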

Once the olfactory receptor protein or one of its peptide fragments or variants has been
obtained as described above, candidate substances or molecules can then be assayed for their
capacity to bind thereto.

The candidate substance or molecule to be assayed for interacting with an olfactory receptor
of the invention may be of diverse nature, including, without being limited to, natural or synthetic
organic compounds or molecules of biological origin such as peptides. It can comprise aromatic or
aliphatic compounds with various functional groups such as alcohol, aldehyde, ester, ether, ketone,
carboxylic acid and amine groups. An example of a substance panel which can be used is provided by
Zhao et al. (1998).

The screening of substances or molecules interacting with an olfactory receptor, or a
fragment thereof, is carried out by photoaffinity labeling experiments as described in Kiefer et al.
(1996). The odorant is labeled, preferably radiolabeled, and incubated with lipid vesicles including
the purified olfactory receptor. The odorants bound to the olfactory receptors are crosslinked by
exposure to ultraviolet light. Then, the samples are subjected to SDS polyacrylamide gel
electrophoresis. Proteins are visualized by Coomassie-blue staining and the odorants are revealed,
preferably by autoradiography. In another embodiment, the proteins can be visualized by Western
Blot with a polyclonal or monoclonal antibody that specifically binds to the olfactory receptor under
consideration. Once a substance binding to the considered olfactory receptor is identified, the
binding specificity of this substance is confirmed with competition experiments demonstrating that
increasing concentrations of unlabeled ligand accomplish a dose-dependent displacement of the
radioactive ligand.

The identification of a first substance specific to one of the olfactory receptors of the present
invention facilitates the screening of other substances. Indeed, the binding capacity of the screened
substances to this olfactory receptor can be assessed through competition experiments against
the first identified substance, which is labeled.

The invention also pertains to kits useful for performing the hereinbefore described
screening method. Preferably, such kits comprise a polypeptide selected from the group consisting of
the polypeptides comprising the amino acid sequences SEQ ID Nos 12-21 or a peptide fragment or a
variant thereof, and optionally means useful to detect the complex formed between the considered
olfactory receptor polypeptide or its peptide fragment or variant and the candidate substance. In a
preferred embodiment, the kit can comprise an already identified substance specific for the olfactory
receptor under consideration which is labeled, preferably radiolabeled, and a monoclonal or
polyclonal antibody directed against the considered olfactory receptor.

A second screening method embodiment consists of a method for the screening of ligand
molecules interacting with an olfactory receptor polypeptide selected from the group consisting of
SEQ ID Nos 12-21, wherein said method comprises:

a) providing a recombinant eukaryotic host cell containing a nucleic acid encoding a
polypeptide selected from the group comprising, consisting essentially of, or consisting of the
polypeptides comprising the amino acid sequences SEQ ID Nos 12-21, or variants or
fragments thereof;

b) preparing membrane extracts of said recombinant eukaryotic host cell;

c) bringing into contact the membrane extracts prepared at step b) with a selected
ligand molecule; and

d) detecting the production level of second messenger metabolites.

The baculovirus-Sf9 cell system enables a foreign DNA encoding an olfactory receptor
selected from the group consisting of OLF1 to OLF10, or a peptide fragment or a variant thereof, to
be expressed with high efficiency. Moreover, it can be used to couple a heterologously expressed
olfactory receptor to the second messenger cascades. Therefore, the binding specificity of an
olfactory receptor can be assessed through an assay of odorant-induced generation of cAMP or
inositol triphosphate (InsP3) as described in Raming et al. (1993).

Briefly, a cell line derived from Sf9 is infected by a baculovirus, such as baculovirus transfer
vector pVL1393, harboring DNA encoding the olfactory receptor or a fragment thereof downstream
from a strong promoter, preferably the polyhedrin promoter. Recombinant viruses are purified and
used to infect 1.5 x 10* Sf9 cells in 100 ml spinner cultures at high multiplicity of infection. Cells are
collected after a postinfection delay, preferably 48 h, and membrane fractions are isolated as follows.

Cells are pelleted (at 250g for 10 min at 4°C), washed with Ringer solution (120 mM NaCl,
5 mM KCl, 1.6 mM K2HPO4, 1.2 mM MgSO4, 25 mM NaHCO3, 5 mM glucose, pH 7.4) and
disrupted using a glass homogenizer in homogenization buffer (10 mM Tris-HCl, pH 8.0, 2 mM
EGTA, 3 mM MgCl2) containing antiproteases. The homogenate is centrifuged and the pellet is
washed. Supernatants are centrifuged at 33,000g for 20 min. The final pellet is resuspended in
homogenization buffer and the protein concentration is determined.
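
For bench preparation, the millimolar concentrations of the Ringer solution can be converted into masses per batch. The sketch below is an illustration only: it assumes anhydrous salts and D-glucose, uses standard molar masses rounded to two decimals, and takes the batch volume as a free parameter; hydrated forms of the salts would require different molar masses.

```python
# Ringer solution composition from the text, in mM.
RINGER_MM = {
    "NaCl": 120.0,
    "KCl": 5.0,
    "K2HPO4": 1.6,
    "MgSO4": 1.2,
    "NaHCO3": 25.0,
    "glucose": 5.0,
}

# Molar masses in g/mol, assuming anhydrous reagents (an assumption).
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "K2HPO4": 174.18,
    "MgSO4": 120.37,
    "NaHCO3": 84.01,
    "glucose": 180.16,
}


def grams_needed(volume_l: float) -> dict[str, float]:
    """Return the mass of each component (g) for a batch of the given volume."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc_mm / 1000.0 * volume_l * MOLAR_MASS[name]
            for name, conc_mm in RINGER_MM.items()}


# Example: a 1 litre batch (the volume is a hypothetical choice).
for name, grams in grams_needed(1.0).items():
    print(f"{name}: {grams:.3f} g")
```

The pH would still need to be adjusted to 7.4 after dissolution, which the calculation does not address.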

Assay of odorant substance-induced generation of second messengers cAMP and InsP3 is
performed as follows. Suspensions of Sf9 cell membrane preparations (300 µg protein) are rapidly
mixed with a stimulation buffer (200 mM NaCl, 10 mM EGTA, 50 mM MOPS, 2.5 mM MgCl2, 1
mM DTT, 0.05 % Na-cholate, 1 mM ATP, 1 µM GTP, and 0.02 µM free Ca2+) containing the
candidate substances at the appropriate concentrations. The reaction is stopped after a short time,
preferably 1 sec, by injecting 10 % perchloric acid. Quenched samples are assayed for second
messenger concentrations. The quenched and cooled samples are vortexed and centrifuged for 5 min
at 2500g at 4°C. 400 µl of the supernatants are transferred to a separate tube containing 100 µl of 10
mM EDTA (pH 7). The samples are neutralized by adding 500 µl of a 1:1 (v/v) mixture of 1,1,2
trichlorofluoroethane, followed by thorough mixing. After centrifugation for 2 min at 500g, three
phases are obtained. The upper phase, which contains all water soluble components, is used for
carrying out the concentration measurements. cAMP and InsP3 concentrations are determined
according to the procedures of Steiner et al. (1972) and Palmer et al. (1989), respectively.

The invention also concerns a kit for the screening of odorant ligand molecules interacting 
with an olfactory receptor polypeptide selected from the group consisting of the polypeptides 
comprising the amino acid sequences SEQ ID Nos 12-21, wherein said kit comprises: 

a) a recombinant eukaryotic host cell containing a nucleic acid encoding a 
polypeptide selected from the group comprising, consisting essentially of, or consisting of 
the polypeptides comprising the amino acid sequences SEQ ID Nos 12-21 or variants or 
fragments thereof; and 

b) optionally, reagents necessary for the measurement of second messenger 
metabolites in a sample. 

The screening of substances or molecules interacting with an olfactory receptor, or a 
fragment thereof, can also be carried out through the measurement of the increase of the response to 
odorants in an olfactory epithelium overexpressing an olfactory receptor selected from the group 
consisting of OLF1 to OLF10, or a peptide fragment or a variant thereof, as described in Zhao et al. 
(1998). The response is assessed by electro-olfactogram, which measures a transepithelial potential 
resulting from the summed activity of many olfactory neurons. In order to overexpress the olfactory 
receptor, or a fragment thereof, in an olfactory epithelium, an adenovirus containing the olfactory 
receptor gene is generated. To aid in electro-olfactogram electrode placements, the olfactory 
receptor coding sequence is preferably combined in the adenovirus with the physiological marker 
green fluorescent protein (GFP) in such a manner that the two proteins are simultaneously expressed. 
The olfactory epithelium of an animal, preferably of a rat, is infected by the adenovirus. Animals are 
killed 3 to 8 days after infection and the nasal cavity is opened, exposing the medial surface of the 
nasal turbinates. Under fluorescent illumination, the GFP clearly marks the pattern of viral 
infection and olfactory receptor expression. Odorant substances are applied to the olfactory 
epithelium in the vapor phase by injecting a pressurized pulse of odorant vapor into a continuous 
stream of humidified clean air. Electro-olfactogram recordings are obtained with a glass capillary 
electrode placed on the surface of the epithelium and connected to a differential amplifier. The 
olfactory receptor specificity is assessed from the increase of response in infected animals compared 
to uninfected animals. To account for the variability between animals, a standard odorant to which 
all other odorant responses are normalized is used. 
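
By way of illustration only, the normalization to a standard odorant and the comparison of 
infected with uninfected animals may be sketched as follows. The function names, the data layout 
(one dictionary of response amplitudes per animal) and the amplitude values are hypothetical 
assumptions made for this sketch; they are not taken from Zhao et al. (1998). 

```python
from statistics import mean


def normalize_to_standard(responses, standard):
    """Divide every response amplitude of one animal by its standard-odorant amplitude."""
    ref = responses[standard]
    if ref == 0:
        raise ValueError("standard odorant response must be non-zero")
    return {odorant: amp / ref for odorant, amp in responses.items()}


def mean_fold_increase(infected, uninfected, standard):
    """Ratio of mean normalized responses (infected / uninfected) for each test odorant."""
    inf = [normalize_to_standard(animal, standard) for animal in infected]
    ctl = [normalize_to_standard(animal, standard) for animal in uninfected]
    return {
        odorant: mean(a[odorant] for a in inf) / mean(a[odorant] for a in ctl)
        for odorant in inf[0]
        if odorant != standard
    }


# Hypothetical amplitudes (mV), one dictionary per animal.
infected = [{"standard": 2.0, "odorant_x": 6.0}, {"standard": 4.0, "odorant_x": 12.0}]
uninfected = [{"standard": 2.0, "odorant_x": 2.0}, {"standard": 5.0, "odorant_x": 5.0}]
fold = mean_fold_increase(infected, uninfected, "standard")
print(fold)
```

A ratio clearly above 1 for a given odorant would suggest that the overexpressed receptor 
contributes to the response to that odorant; the threshold to apply is left to the experimenter. 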

A third screening method embodiment consists of a method for the screening of ligand 
molecules interacting with an olfactory receptor polypeptide selected from the group consisting of 
the polypeptides comprising the amino acid sequences SEQ ID Nos 12-21, wherein said method 
comprises: 

a) providing an adenovirus containing a nucleic acid encoding a polypeptide selected 
from the group comprising, consisting essentially of, or consisting of the polypeptides 
comprising the amino acid sequences SEQ ID Nos 12-21, or variants or fragments thereof; 

b) infecting an olfactory epithelium with said adenovirus; 

c) bringing into contact the olfactory epithelium of step b) with a selected ligand molecule; 
and 

d) detecting the increase of the response to said ligand molecule. 

G. METHODS FOR INHIBITING THE EXPRESSION OF AN OLFACTORY 
RECEPTOR GENE 

Other therapeutic compositions according to the present invention advantageously comprise 
an oligonucleotide fragment of the nucleic sequence of an olfactory receptor as an antisense tool or a 
triple helix tool that inhibits the expression of the corresponding olfactory receptor gene. A 
preferred fragment of the nucleic sequence of an olfactory receptor comprises an allele of at least one of 
the biallelic markers A1 to A13. 

Antisense Approach 

Preferred methods using antisense polynucleotides according to the present invention are the 
procedures described by Sczakiel et al. (1995). 

Preferred antisense polynucleotides are described in the section entitled "Nuclear Antisense 
DNA Constructs". 

The antisense nucleic acids should have a length and melting temperature sufficient to 
permit formation of an intracellular duplex having sufficient stability to inhibit the expression of the 
olfactory receptor mRNA in the duplex. Strategies for designing antisense nucleic acids suitable for 
use in gene therapy are disclosed in Green et al. (1986) and Izant and Weintraub (1984). 

In some strategies, antisense molecules are obtained by reversing the orientation of the 
olfactory receptor coding region with respect to a promoter so as to transcribe the opposite strand 
from that which is normally transcribed in the cell. The antisense molecules may be transcribed 
using in vitro transcription systems such as those which employ T7 or SP6 polymerase to generate 
the transcript. Another approach involves transcription of olfactory receptor antisense nucleic acids 
in vivo by operably linking DNA containing the antisense sequence to a promoter in a suitable 
expression vector. 
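
As a purely illustrative sketch, the relationship between a coding (sense) strand and the 
antisense transcript obtained by transcribing the opposite strand is that of a reverse complement. 
The function names and the short example sequence below are hypothetical and are not taken from 
the sequence listing. 

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the antisense RNA (5' to 3') complementary to the given sense strand."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


example_sense = "ATGGCC"  # made-up six-base sequence, for illustration only
print(antisense_rna(example_sense))
```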

Alternatively, suitable antisense strategies are those described by Rossi et al. (1991), in the 
International Applications Nos. WO 94/23026, WO 95/04141, WO 92/18522 and in the European 
Patent Application No. EP 0 572 287 A2. 

An alternative to the antisense technology that is used according to the present invention 
comprises using ribozymes that will bind to a target sequence via their complementary 
polynucleotide tail and that will cleave the corresponding RNA by hydrolyzing its target site 
(namely "hammerhead ribozymes"). Briefly, the simplified cycle of a hammerhead ribozyme 
comprises (1) sequence-specific binding to the target RNA via complementary antisense sequences; 
(2) site-specific hydrolysis of the cleavable motif of the target strand; and (3) release of cleavage 
products, which gives rise to another catalytic cycle. Indeed, the use of long-chain antisense 
polynucleotides (at least 30 bases long) or ribozymes with long antisense arms is advantageous. A 
preferred delivery system for antisense ribozymes is achieved by covalently linking these antisense 
ribozymes to lipophilic groups or by using liposomes as a convenient vector. Preferred antisense 
ribozymes according to the present invention are prepared as described by Sczakiel et al. (1995), the 
specific preparation procedures being referred to in said article. 

Triple Helix Approach 

The olfactory receptor genomic DNA may also be used to inhibit the expression of the 
olfactory receptor gene based on intracellular triple helix formation. 

Triple helix oligonucleotides are used to inhibit transcription from a genome. They are 
particularly useful for studying alterations in cell activity when it is associated with a particular 
gene. 

Similarly, a portion of the olfactory receptor genomic DNA can be used to study the effect 
of inhibiting olfactory receptor transcription within a cell. Traditionally, homopurine sequences 
were considered the most useful for triple helix strategies. However, homopyrimidine sequences can 
also inhibit gene expression. Such homopyrimidine oligonucleotides bind to the major groove at 
homopurine:homopyrimidine sequences. Thus, both types of sequences from the olfactory receptor 
genomic DNA are contemplated within the scope of this invention. 

To carry out gene therapy strategies using the triple helix approach, the sequences of the 
olfactory receptor genomic DNA are first scanned to identify 10-mer to 20-mer homopyrimidine or 
homopurine stretches which could be used in triple-helix based strategies for inhibiting olfactory 
receptor expression. Following identification of candidate homopyrimidine or homopurine 
stretches, their efficiency in inhibiting olfactory receptor expression is assessed by introducing 
varying amounts of oligonucleotides containing the candidate sequences into tissue culture cells 
which express the olfactory receptor gene. 
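
The scanning step described above may, for example, be sketched as follows. This is a minimal 
illustration under stated assumptions: the function name, the default minimum length and the 
example sequence are hypothetical, and only maximal uninterrupted runs on the strand supplied are 
reported. A run longer than 20 nucleotides contains several candidate 20-mers, and a homopurine 
run on one strand corresponds to a homopyrimidine run on the complementary strand. 

```python
import re


def find_homo_stretches(seq, min_len=10):
    """Return (kind, 0-based start, run) for maximal purine-only or pyrimidine-only runs."""
    seq = seq.upper()
    hits = []
    for kind, pattern in (("homopurine", r"[AG]+"), ("homopyrimidine", r"[CT]+")):
        for match in re.finditer(pattern, seq):
            run = match.group()
            if len(run) >= min_len:
                hits.append((kind, match.start(), run))
    # Report candidates in the order in which they occur along the sequence.
    return sorted(hits, key=lambda hit: hit[1])


# Made-up sequence containing one 13-nucleotide purine-only run.
example = "GATC" + "AGGAAGGAGAGAA" + "TCGA"
print(find_homo_stretches(example))
```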
The oligonucleotides can be introduced into the cells using a variety of methods known to 
those skilled in the art, including but not limited to calcium phosphate precipitation, DEAE-Dextran, 
electroporation, liposome-mediated transfection or native uptake. 

Treated cells are monitored for altered cell function or reduced olfactory receptor expression 
using techniques such as Northern blotting, RNase protection assays, or PCR based strategies to 
monitor the transcription levels of the olfactory receptor gene in cells which have been treated with 
the oligonucleotide. 

The oligonucleotides which are effective in inhibiting gene expression in tissue culture cells 
may then be introduced in vivo, using the techniques described above for the antisense approach, at a 
dosage calculated based on the in vitro results. 

In some embodiments, the natural (beta) anomers of the oligonucleotide units can be 
replaced with alpha anomers to render the oligonucleotide more resistant to nucleases. Further, an 
intercalating agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha 
oligonucleotide to stabilize the triple helix. For information on the generation of oligonucleotides 
suitable for triple helix formation see Griffin et al. (1989). 

H. COMPUTER-RELATED EMBODIMENTS 

As used herein the term "nucleic acid codes of the invention" encompasses the nucleotide 
sequences comprising, consisting essentially of, or consisting of any of the polynucleotides 
described in the "Coding Regions of the olfactory receptor gene" section, "Genomic sequence of the 
olfactory receptor gene" section and the "Oligonucleotide Probes And Primers" section, or variants 
thereof, or complementary sequences thereto. Homologous sequences refer to a sequence having at 
least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% homology to these contiguous spans. 
Homology may be determined using any method described herein, including BLAST2N with the 
default parameters or with any modified parameters. Homologous sequences also may include RNA 
sequences in which uridines replace the thymines in the nucleic acid codes of the invention. 

As used herein the term "polypeptide codes of the invention" encompasses the polypeptide 
sequences comprising any of the polypeptides described in the "OLF1 to OLF10 proteins and 
polypeptide fragments" section. 

It will be appreciated that the nucleic acid and polypeptide codes of the invention can be 
represented in the traditional single character format or three letter format respectively (See the inside 
back cover of Stryer, Lubert. Biochemistry, 3rd edition. W. H. Freeman & Co., New York.) or in any 
other format or code which records the identity of the nucleotides or the amino acids respectively in a 
sequence. 

It will be appreciated by those skilled in the art that the nucleic acid codes of the invention and 
polypeptide codes of the invention can be stored, recorded, and manipulated on any medium which can 
be read and accessed by a computer. As used herein, the words "recorded" and "stored" refer to a 
process for storing information on a computer medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on a computer readable medium to generate 
manufactures comprising one or more of the nucleic acid codes of the invention, or one or more of the 
polypeptide codes of the invention. Another aspect of the present invention is a computer readable 
medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, or 50 nucleic acid codes of the 
invention. Another aspect of the present invention is a computer readable medium having recorded 
thereon at least 2, 5, 10, 15, 20, 25, 30, or 50 polypeptide codes of the invention. 

Computer readable media include magnetically readable media, optically readable media, 
electronically readable media and magnetic/optical media. For example, the computer readable media 
may be a hard disc, a floppy disc, a magnetic tape, CD-ROM, DVD, RAM, or ROM as well as other 
types of media known to those skilled in the art. 

Embodiments of the present invention include systems, particularly computer systems which 
contain the sequence information described herein. As used herein, "a computer system" refers to the 
hardware components, software components, and data storage components used to store and/or analyze 
the nucleotide sequences of the nucleic acid codes of the invention, the amino acid sequences of the 
polypeptide codes of the invention, or other sequences. The computer system preferably includes the 
computer readable media described above, and a processor for accessing and manipulating the sequence 
data. 

In some embodiments, the computer system may further comprise a sequence comparer for 
comparing the nucleic acid codes or polypeptide codes of the invention stored on a computer readable 
medium to reference nucleotide sequences stored on a computer readable medium. A "sequence 
comparer" refers to one or more programs which are implemented on the computer system to compare a 
nucleotide or polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds 
including but not limited to peptides, peptidomimetics, and chemicals the sequences or structures of 
which are stored within the data storage means. For example, the sequence comparer may compare the 
nucleotide sequences of the nucleic acid codes of the invention or the amino acid sequences of the 
polypeptide codes of the invention stored on a computer readable medium to reference sequences stored 
on a computer readable medium to identify homologies, motifs implicated in biological function, or 
structural motifs. The various sequence comparer programs identified elsewhere in this patent 
specification are particularly contemplated for use in this aspect of the invention. 

Accordingly, one aspect of the present invention is a computer system comprising a 
processor, a data storage device having stored thereon a nucleic acid code of the invention or a 
polypeptide code of the invention, a data storage device having retrievably stored thereon reference 
nucleotide sequences or polypeptide sequences to be compared to the nucleic acid code of the 
invention or polypeptide code of the invention and a sequence comparer for conducting the 
comparison. The sequence comparer may indicate a homology level between the sequences 
compared or identify structural motifs in the nucleic acid code of the invention and polypeptide 
codes of the invention or it may identify structural motifs in sequences which are compared to these 
nucleic acid codes and polypeptide codes. In some embodiments, the data storage device may have 
stored thereon the sequences of at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of the 
invention or polypeptide codes of the invention. 

Another aspect of the present invention is a method for determining the level of homology 
between a nucleic acid code of the invention and a reference nucleotide sequence, comprising the 
steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a 
computer program which determines homology levels and determining homology between the nucleic 
acid code and the reference nucleotide sequence with the computer program. The computer program 
may be any of a number of computer programs for determining homology levels, including those 
specifically enumerated herein, including BLAST2N with the default parameters or with any modified 
parameters. The method may be implemented using the computer systems described above. The 
method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, or 50 of the above described nucleic 
acid codes of the invention through the use of the computer program and determining homology 
between the nucleic acid codes and reference nucleotide sequences. 
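
For illustration only, a homology level may be expressed as a percent identity computed over two 
sequences that have already been aligned. The sketch below is not BLAST2N and does not perform 
any alignment; it assumes that the two input strings have equal length and use "-" for gaps. The 
function name and the example sequences are hypothetical. 

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns carrying the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Made-up ten-base sequences differing at a single position.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity)
```

In this sketch a gap opposite a residue counts as a mismatch; other conventions exist and would 
give different percentages for gapped alignments. 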

Alternatively, the computer program may be a computer program which compares the 
nucleotide sequences of the nucleic acid codes of the present invention to reference nucleotide 
sequences in order to determine whether the nucleic acid code of the invention differs from a reference 
nucleic acid sequence at one or more positions. Optionally such a program records the length and 
identity of inserted, deleted or substituted nucleotides with respect to the sequence of either the 
reference polynucleotide or the nucleic acid code of the invention. In one embodiment, the computer 
program may be a program which determines whether the nucleotide sequences of the nucleic acid 
codes of the invention contain one or more single nucleotide polymorphisms (SNP) with respect to a 
reference nucleotide sequence. These single nucleotide polymorphisms may each comprise a single 
base substitution, insertion, or deletion. 
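
A program of the kind described above may, as a minimal sketch, walk through two pre-aligned 
sequences and record each substituted, inserted or deleted nucleotide. The function name, the use 
of "-" as the gap character and the example alignment are assumptions made for this illustration; 
positions are 1-based positions in the reference, and an insertion is reported at the position of 
the reference base that precedes it. 

```python
def list_differences(aligned_ref, aligned_query):
    """Return (kind, reference position, reference base, query base) for each difference."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    differences = []
    ref_pos = 0  # 1-based position of the most recent reference base seen
    for r, q in zip(aligned_ref.upper(), aligned_query.upper()):
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        if r == "-":
            differences.append(("insertion", ref_pos, "-", q))
        elif q == "-":
            differences.append(("deletion", ref_pos, r, "-"))
        else:
            differences.append(("substitution", ref_pos, r, q))
    return differences


# Made-up alignment with one substitution, one insertion and one deletion.
diffs = list_differences("ACGT-ACG", "ACCTTA-G")
print(diffs)
```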

Another aspect of the present invention is a method for determining the level of homology 
between a polypeptide code of the invention and a reference polypeptide sequence, comprising the 
steps of reading the polypeptide code of the invention and the reference polypeptide sequence through 
use of a computer program which determines homology levels and determining homology between the 
polypeptide code and the reference polypeptide sequence using the computer program. 

Accordingly, another aspect of the present invention is a method for determining whether a 
nucleic acid code of the invention differs at one or more nucleotides from a reference nucleotide 
sequence comprising the steps of reading the nucleic acid code and the reference nucleotide 
sequence through use of a computer program which identifies differences between nucleic acid 
sequences and identifying differences between the nucleic acid code and the reference nucleotide 
sequence with the computer program. In some embodiments, the computer program is a program 
which identifies single nucleotide polymorphisms. The method may be implemented by the 
computer systems described above. The method may also be performed by reading at least 2, 5, 10, 
15, 20, 25, 30, or 50 of the nucleic acid codes of the invention and the reference nucleotide 
sequences through the use of the computer program and identifying differences between the nucleic 
acid codes and the reference nucleotide sequences with the computer program. 

An "identifier" refers to one or more programs which identify certain features within the 
above-described nucleotide sequences of the nucleic acid codes of the invention or the amino acid 
sequences of the polypeptide codes of the invention. 

In one embodiment, the identifier may comprise a molecular modeling program which 
determines the 3-dimensional structure of the polypeptide codes of the invention. In some 
embodiments, the molecular modeling program identifies target sequences that are most compatible 
with profiles representing the structural environments of the residues in known three-dimensional 
protein structures. (See, e.g., Eisenberg et al., U.S. Patent No. 5,436,850 issued July 25, 1995). In 
another technique, the known three-dimensional structures of proteins in a given family are 
superimposed to define the structurally conserved regions in that family. This protein modeling 
technique also uses the known three-dimensional structure of a homologous protein to approximate 
the structure of the polypeptide codes of the invention. (See e.g., Srinivasan et al., U.S. Patent 
No. 5,557,535 issued September 17, 1996). Conventional homology modeling techniques have been 
used routinely to build models of proteases and antibodies. (Sowdhamini et al., (1997)). 
Comparative approaches can also be used to develop three-dimensional protein models when the 
protein of interest has poor sequence identity to template proteins. In some cases, proteins fold into 
similar three-dimensional structures despite having very weak sequence identities. For example, 
a number of helical cytokines fold into a similar three-dimensional topology in spite of weak 
sequence homology. 

The recent development of threading methods now enables the identification of likely 
folding patterns in a number of situations where the structural relatedness between target and 
template(s) is not detectable at the sequence level. Hybrid methods may be used, in which fold 
recognition is performed using Multiple Sequence Threading (MST), structural equivalencies are 
deduced from the threading output using a distance geometry program DRAGON to construct a low 
resolution model, and a full-atom representation is constructed using a molecular modeling package 
such as QUANTA. According to this 3-step approach, candidate templates are first identified by using the 
novel fold recognition algorithm MST, which is capable of performing simultaneous threading of 
multiple aligned sequences onto one or more 3-D structures. In a second step, the structural 
equivalencies obtained from the MST output are converted into interresidue distance restraints and 
fed into the distance geometry program DRAGON, together with auxiliary information obtained 
from secondary structure predictions. The program combines the restraints in an unbiased manner 
and rapidly generates a large number of low resolution model conformations. In a third step, these 
low resolution model conformations are converted into full-atom models and subjected to energy 
minimization using the molecular modeling package QUANTA. (See e.g., Aszodi et al., (1997)). 

The results of the molecular modeling analysis may then be used in rational drug design 
techniques to identify agents which modulate the activity of the polypeptide codes of the invention. 

Accordingly, another aspect of the present invention is a method of identifying a feature 
within the nucleic acid codes of the invention or the polypeptide codes of the invention comprising 
reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program 
which identifies features therein and identifying features within the nucleic acid code(s) or 
polypeptide code(s) with the computer program. In one embodiment, the computer program comprises a 
computer program which identifies open reading frames. In a further embodiment, the computer 
program identifies structural motifs in a polypeptide sequence. In another embodiment, the 
computer program comprises a molecular modeling program. The method may be performed by 
reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of the 
invention or the polypeptide codes of the invention through the use of the computer program and 
identifying features within the nucleic acid codes or polypeptide codes with the computer program. 
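
As one possible illustration of a program which identifies open reading frames, the sketch below 
scans the three forward reading frames of a nucleotide sequence for stretches running from an ATG 
codon to the first in-frame stop codon. The function name, the minimum length parameter and the 
example sequence are hypothetical; the reverse strand is not examined and would require the 
reverse complement of the sequence to be scanned in the same way. 

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ATG-to-stop ORFs on the forward strand.

    start and end are 0-based, end-exclusive coordinates; the stop codon is
    included in the ORF and counted in min_codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Made-up sequence containing a single four-codon ORF in frame 2.
example_dna = "CC" + "ATGGCTAAATGA" + "GG"
orfs = find_orfs(example_dna)
print(orfs)
```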

The nucleic acid codes of the invention or the polypeptide codes of the invention may be 
stored and manipulated in a variety of data processor programs in a variety of formats. For example, 
they may be stored as text in a word processing file, such as Microsoft WORD or WORDPERFECT, or 
as an ASCII file in a variety of database programs familiar to those of skill in the art, such as DB2, 
SYBASE, or ORACLE. In addition, many computer programs and databases may be used as sequence 
comparers, identifiers, or sources of reference nucleotide or polypeptide sequences to be compared to 
the nucleic acid codes of the invention or the polypeptide codes of the invention. The following list is 
intended not to limit the invention but to provide guidance to programs and databases which are useful 
with the nucleic acid codes of the invention or the polypeptide codes of the invention. The programs 
and databases which may be used include, but are not limited to: MacPattern (EMBL), DiscoveryBase 
(Molecular Applications Group), GeneMine (Molecular Applications Group), Look (Molecular 
Applications Group), MacLook (Molecular Applications Group), BLAST and BLAST2 (NCBI), 
BLASTN and BLASTX (Altschul et al., 1990), FASTA (Pearson and Lipman, 1988), FASTDB 
(Brutlag et al., 1990), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations 
Inc.), Cerius2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight 
II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.), CHARMm (Molecular 
Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi (Molecular Simulations Inc.), 
QuanteMM (Molecular Simulations Inc.), Homology (Molecular Simulations Inc.), Modeler 
(Molecular Simulations Inc.), ISIS (Molecular Simulations Inc.), Quanta/Protein Design (Molecular 
Simulations Inc.), WebLab (Molecular Simulations Inc.), WebLab Diversity Explorer (Molecular 
Simulations Inc.), Gene Explorer (Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), 
the EMBL/Swissprotein database, the MDL Available Chemicals Directory database, the MDL Drug 
Data Report database, the Comprehensive Medicinal Chemistry database, Derwent's World Drug 
Index database, the BioByteMasterFile database, the Genbank database, and the Genseqn database. 
Many other programs and databases would be apparent to one of skill in the art given the present 
disclosure. 

Motifs which may be detected using the above programs include sequences encoding 
leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and 
beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded 
proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches, 
enzymatic active sites, substrate binding sites, and enzymatic cleavage sites. 
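
By way of example only, one of the motifs listed above, the N-glycosylation site, can be searched 
for in a polypeptide sequence with the widely used consensus pattern N-{P}-[ST]-{P} (asparagine, 
any residue except proline, serine or threonine, any residue except proline). The function name 
and the example sequence below are hypothetical, and a pattern match only indicates a potential 
site, not an experimentally confirmed one. 

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glycosylation_sites(protein):
    """Return (1-based position of the asparagine, matched four-residue motif)."""
    return [
        (match.start() + 1, match.group(1))
        for match in N_GLYCOSYLATION.finditer(protein.upper())
    ]


# Made-up peptide: N2 and N11 fit the pattern, N7 is followed by proline.
sites = find_n_glycosylation_sites("MNGSAANPTQNKTL")
print(sites)
```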

Throughout this application, various publications, patents and published patent applications 
are cited. The disclosures of these publications, patents and published patent specifications 
referenced in this application are hereby incorporated by reference into the present disclosure to 
more fully describe the state of the art to which this invention pertains. 

EXAMPLES 

EXAMPLE 1 : LOCALIZATION OF THE OLFACTORY RECEPTOR GENES OLF3 
AND OLF5 ON THE HUMAN CHROMOSOMES. 

Metaphase chromosome preparation 

Metaphase chromosomes were prepared from phytohemagglutinin (PHA)-stimulated blood 
cells of donors. PHA-stimulated lymphocytes from healthy males were cultured for 72 h in RPMI-1640 
medium. For synchronization, methotrexate (10 µM) was added for 17 h, followed by addition of 
5-bromodeoxyuridine (5-BrdU, 0.1 mM) for 6 h. Colcemid (1 mg/ml) was added for the last 15 min 
before harvesting the cells. Cells were collected, washed in RPMI, incubated with a hypotonic 
solution of KCl (75 mM) at 37°C for 15 min and fixed in three changes of methanol:acetic acid 
(3:1). The cell suspension was dropped onto a glass slide, air-dried and kept in darkness at -20°C 
until use. 

Probes: 

- The BAC H0526H04 containing the Olf3 and Olf5 genes was used to generate a probe by Alu-PCR. 
PCR amplification of BAC recombinant DNA (50 ng) was carried out as described by Romana 
et al. (1993). 

- Two DNA fragments carrying respectively Olf3 and Olf5 sequences were generated by 
long range PCR with specific primers (SEQ ID 96-99) and used as probes to confirm the localization 
of each gene. The Olf3 and Olf5 amplicons are respectively 2.8 kb and 3.2 kb fragments. 

Probes were labeled by nick translation with bio-16-dUTP (Boehringer Mannheim), and 
purified over a Sephadex G50 column. 

Fluorescence In Situ Hybridization 

To determine the chromosomal localization of both genes, the BAC probe was initially 
hybridized to human metaphase cells. When biotinylated PCR products of BAC DNA were used in 
hybridization experiments, 75 ng of probe was precipitated with 75 µg of competitor DNA (human 
Cot1 DNA, GIBCO-BRL) and resuspended in 10 µl of hybridization buffer (50% formamide, 2 X 
SSC, 10% dextran sulfate, 1 mg/ml sonicated herring DNA, pH 7). When long range PCR products 
of the Olf3 or Olf5 genes were used as probes, 5 ng of biotinylated probe were mixed with 5 µg of 
human Cot1 DNA. Prior to hybridization, the probe was denatured at 70°C for 10 min and 
preannealed at 37°C for 2 h. 

Slides were treated for 1 h at 37°C with RNase A (100 µg/ml), rinsed three times in 2 X SSC 
and dehydrated in an ethanol series. Chromosome preparations were denatured in 70% formamide, 2 
X SSC (pH 7), for 2 min at 70°C, then dehydrated at 4°C. The slides were treated with proteinase K 
(10 µg/ml in 20 mM Tris-HCl, 2 mM CaCl2) at 37°C for 8-10 min and dehydrated. After 
preannealing, the hybridization mixture containing the probe was placed on the slide, covered with a 
coverslip, sealed with rubber cement and incubated overnight in a humid chamber at 37°C. After 
hybridization and post hybridization washes, the biotinylated probe was detected by avidin-FITC (5 
µg/ml, Vector Laboratories) and amplified once with additional layers of biotinylated goat 
anti-avidin (5 µg/ml, Vector Laboratories) and avidin-FITC. For chromosomal localization, fluorescent 
R-Bands were obtained as described by Cherif et al. (1990). The slides were observed under a 
LEICA fluorescent microscope (DMRXA). Chromosomes were counterstained with propidium 
iodide and the fluorescent signal of the probe appeared as two symmetrical yellow-green spots on 
both chromatids of the fluorescent R-band chromosome. 

Localization 

A specific signal (a double yellow-green spot) was observed on band 11q12-q13 on at least 
one chromosome 11 in >80% of the metaphases with all the probes. 

EXAMPLE 2 : IDENTIFICATION OF BIALLELIC MARKERS: DNA 
EXTRACTION 

Donors were unrelated and healthy. They presented a sufficient diversity for being representative of a 
French heterogeneous population. The DNA from 100 individuals was extracted and tested for the 
detection of the biallelic markers. 

30 ml of peripheral venous blood were taken from each donor in the presence of EDTA. 
Cells (pellet) were collected after centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by 
a lysis solution (50 ml final volume: 10 mM Tris pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution 
was centrifuged (10 minutes, 2000 rpm) as many times as necessary to eliminate the residual red 
cells present in the supernatant, after resuspension of the pellet in the lysis solution. 

The pellet of white cells was lysed overnight at 42°C with 3.7 ml of lysis solution composed 
of: 

- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M 
- 200 µl SDS 10% 
- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M). 

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After 
vigorous agitation, the solution was centrifuged for 20 minutes at 10000 rpm. 

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous 
supernatant, and the solution was centrifuged for 30 minutes at 2000 rpm. The DNA solution was 
rinsed three times with 70% ethanol to eliminate salts, and centrifuged for 20 minutes at 2000 rpm. 
The pellet was dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA 
concentration was evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA). 

To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio was 
determined. Only DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 were used 
in the subsequent examples described below. 

The pool was constituted by mixing equivalent quantities of DNA from each individual. 
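
The quality control and pooling arithmetic of this example can be sketched as follows. This is only an illustrative sketch: the conversion factor (1 OD unit at 260 nm = 50 µg/ml) and the 1.8 to 2 acceptance window come from the text, whereas the `Reading` record, its `dilution` field, the function names and the sample values are hypothetical.

```python
from dataclasses import dataclass

# From the text: 1 OD unit at 260 nm corresponds to 50 µg/ml of DNA.
UG_PER_ML_PER_OD260 = 50.0


@dataclass
class Reading:
    """One spectrophotometer reading (hypothetical record layout)."""
    sample_id: str
    od260: float
    od280: float
    dilution: float = 1.0  # assumption: fold dilution applied before reading


def concentration_ug_per_ml(r: Reading) -> float:
    """DNA concentration of the undiluted preparation, in µg/ml."""
    return r.od260 * UG_PER_ML_PER_OD260 * r.dilution


def is_pure(r: Reading, low: float = 1.8, high: float = 2.0) -> bool:
    """Keep only preparations whose OD 260 / OD 280 ratio lies between 1.8 and 2."""
    return low <= r.od260 / r.od280 <= high


def pool_volumes_ul(readings, mass_ug_each: float) -> dict:
    """Volume (µl) to take from each accepted sample so that every
    individual contributes the same mass of DNA to the pool."""
    return {
        r.sample_id: 1000.0 * mass_ug_each / concentration_ug_per_ml(r)
        for r in readings
        if is_pure(r)
    }


if __name__ == "__main__":
    demo = [
        Reading("donor-1", od260=0.40, od280=0.21),
        Reading("donor-2", od260=0.80, od280=0.42),
        Reading("donor-3", od260=0.50, od280=0.35),  # ratio below 1.8, rejected
    ]
    for sample_id, volume in pool_volumes_ul(demo, mass_ug_each=1.0).items():
        print(f"{sample_id}: {volume:.1f} µl")
```

Taking equal masses rather than equal volumes is what keeps each individual equally represented in the pool when the preparations differ in concentration.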

EXAMPLE 3 : IDENTIFICATION OF BIALLELIC MARKERS: AMPLIFICATION 
OF GENOMIC DNA BY PCR 

The amplification of specific genomic sequences of the DNA samples of example 2 was 
carried out on the pool of DNA obtained previously. In addition, 50 individual samples were 
similarly amplified. 

PCR assays were performed using the following protocol: 

Final volume                                          25 µl 
DNA                                                   2 ng/µl 
MgCl2                                                 2 mM 
dNTP (each)                                           200 µM 
primer (each)                                         2.9 ng/µl 
Ampli Taq Gold DNA polymerase                         0.05 unit/µl 
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)   1x 
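
As a worked illustration of the protocol above, the final concentrations can be converted into the absolute amount of each component present in one 25 µl reaction. The concentrations and the final volume are those listed in the protocol; the dictionary keys and the function name are hypothetical, and stock concentrations are deliberately not assumed because the text does not give them.

```python
FINAL_VOLUME_UL = 25.0  # final reaction volume from the protocol

# Final concentrations from the protocol.
DNA_NG_PER_UL = 2.0
MGCL2_MM = 2.0
DNTP_EACH_UM = 200.0
PRIMER_EACH_NG_PER_UL = 2.9
POLYMERASE_UNIT_PER_UL = 0.05


def per_reaction_amounts(volume_ul: float = FINAL_VOLUME_UL) -> dict:
    """Absolute amount of each component in a single reaction.

    Unit bookkeeping: mM x µl = nmol, and µM x µl = pmol.
    """
    return {
        "DNA_ng": DNA_NG_PER_UL * volume_ul,
        "MgCl2_nmol": MGCL2_MM * volume_ul,
        "dNTP_each_nmol": DNTP_EACH_UM * volume_ul / 1000.0,
        "primer_each_ng": PRIMER_EACH_NG_PER_UL * volume_ul,
        "polymerase_units": POLYMERASE_UNIT_PER_UL * volume_ul,
    }


if __name__ == "__main__":
    for name, amount in per_reaction_amounts().items():
        print(f"{name}: {amount:g}")
```

Under these figures each reaction contains 50 ng of template DNA, 72.5 ng of each primer and 1.25 units of polymerase.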



Each pair of first primers was designed using the sequence information of the olfactory 
receptor gene cluster disclosed herein and the OSP software (Hillier & Green, 1991). This first pair 
of primers was about 20 nucleotides in length and had the sequences disclosed in Table 1 in the 
columns labeled PU and RP. 



Table 1 

Amplicon   Position range of    Primer    Position range of       Primer    Complementary position 
           the amplicon in      name      amplification primer    name      range of amplification 
           SEQ ID No 1          RP        in SEQ ID No 1          PU        primer in SEQ ID No 1 

99-13670   7362     7824        B1        7362     7380           C1        7805     7824 
99-13669   8120     8662        B2        8120     8140           C2        8643     8662 
99-13666   14308    14757       B3        14308    14328          C3        14740    14757 
99-13664   19346    19845       B4        19346    19366          C4        19826    19845 
99-13663   20298    20800       B5        20298    20318          C5        20781    20800 
99-13660   76752    77223       B6        76752    76772          C6        77205    77223 
99-13652   90967    91494       B7        90967    90987          C7        91474    91494 
99-13671   133925   134393      B8        133925   133945         C8        134375   134393 
99-13649   139807   140351      B9        139807   139826         C9        140331   140351 
99-13648   140912   141434      B10       140912   140932         C10       141416   141434 
99-13647   143828   144309      B11       143828   143847         C11       144292   144309 



Preferably, the primers contained a common oligonucleotide tail upstream of the specific 
bases targeted for amplification which was useful for sequencing. 

Primers PU contain the following additional PU 5' sequence: 
TGTAAAACGACGGCCAGT; primers RP contain the following RP 5' sequence: 
CAGGAAACAGCTATGACC. The primer containing the additional PU 5' sequence is listed in 
SEQ ID No 26. The primer containing the additional RP 5' sequence is listed in SEQ ID No 27. 
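
The way a tailed primer pair follows from the position ranges of Table 1 can be sketched as below. The two tail sequences are those given above; the reading that the RP primer is taken from the listed strand while the PU primer is the reverse complement of its listed range follows the "Complementary position range" column heading. The helper names and the short toy reference sequence are hypothetical and stand in for SEQ ID No 1.

```python
PU_TAIL = "TGTAAAACGACGGCCAGT"  # additional PU 5' sequence from the text
RP_TAIL = "CAGGAAACAGCTATGACC"  # additional RP 5' sequence from the text

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def region(reference: str, start: int, end: int) -> str:
    """Extract positions start..end (1-based, inclusive) from the reference."""
    return reference[start - 1:end]


def tailed_primers(reference: str, rp_range, pu_range):
    """Build the (RP, PU) primer pair with their 5' tails.

    rp_range is read directly from the reference strand; pu_range is a
    complementary range, so the primer is its reverse complement.
    """
    rp_specific = region(reference, *rp_range)
    pu_specific = reverse_complement(region(reference, *pu_range))
    return RP_TAIL + rp_specific, PU_TAIL + pu_specific


if __name__ == "__main__":
    toy_reference = "ACGTTGCAAGGCTTACCGATCGATTTGGCCAATGC"  # illustrative only
    rp, pu = tailed_primers(toy_reference, rp_range=(1, 6), pu_range=(30, 35))
    print(rp)
    print(pu)
```

With the real reference, the ranges for amplicon 99-13670 would be `(7362, 7380)` for B1 and `(7805, 7824)` for C1.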

The synthesis of these primers was performed following the phosphoramidite method, on a 
GENSET UFPS 24.1 synthesizer. 

DNA amplification was performed on a Genius II thermocycler. After heating at 95°C for 10 
min, 40 cycles were performed. Each cycle comprised: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 
72°C. For final elongation, 10 min at 72°C ended the amplification. The quantities of the 
amplification products obtained were determined on 96-well microtiter plates, using a fluorometer 
and Picogreen as intercalating agent (Molecular Probes). 
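
For planning purposes, the cycling program described above can be written down as data and its total hold time computed. The temperatures, durations and cycle count are those of the text; the `Step` record and function name are hypothetical, and ramp times between temperatures are not included because the text does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


INITIAL_DENATURATION = Step(95, 10 * 60)
CYCLE = (Step(95, 30), Step(54, 60), Step(72, 30))
N_CYCLES = 40
FINAL_ELONGATION = Step(72, 10 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed holds, ignoring temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_ELONGATION.seconds


if __name__ == "__main__":
    print(f"{total_hold_seconds() / 60:.0f} min of programmed holds")
```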

EXAMPLE 4 : IDENTIFICATION OF BIALLELIC MARKERS: SEQUENCING OF 
AMPLIFIED GENOMIC DNA AND IDENTIFICATION OF POLYMORPHISMS. 

The sequencing of the amplified DNA obtained in example 3 was carried out on ABI 377 
sequencers. The sequences of the amplification products were determined using automated dideoxy 
terminator sequencing reactions with a dye terminator cycle sequencing protocol. The products of 
the sequencing reactions were run on sequencing gels and the sequences were determined using gel 
image analysis (ABI Prism DNA Sequencing Analysis software (2.1.2 version)). 

The sequence data were further evaluated using the above mentioned polymorphism analysis 
software designed to detect the presence of biallelic markers among the pooled amplified fragments. 
The polymorphism search was based on the presence of superimposed peaks in the electrophoresis 
pattern resulting from different bases occurring at the same position as described previously. 

11 fragments of amplification were analyzed. In these segments, 13 biallelic markers 
referred to as A1 to A13 in the BM column were detected. The localization of these biallelic markers 
is as shown in Table 2. 

Table 2 

Amplicon   BM    Marker Name     Localization in OLF      Polymorphism   BM position in 
                                 gene cluster                            SEQ ID No 1 

99-13670   A1    99-13670-305    Between Orf1 and Orf2    A/C            7521 
99-13669   A2    99-13669-471    Between Orf1 and Orf2    A/C            8192 
99-13666   A3    99-13666-275    Between Orf2 and Orf3    A/T            14483 
99-13664   A4    99-13664-221    Between Orf2 and Orf3    A/G            19625 
99-13663   A5    99-13663-218    Between Orf2 and Orf3    C/T            20583 
99-13660   A6    99-13660-277    Between Orf4 and Orf5    G/T            76947 
99-13652   A7    99-13652-407    Between Orf5 and Orf6    G/C            91088 
99-13652   A8    99-13652-357    Between Orf5 and Orf6    C/T            91138 
99-13652   A9    99-13652-308    Between Orf5 and Orf6    C/T            91187 
99-13671   A10   99-13671-396    Between Orf9 and Orf10   C/T            133998 
99-13649   A11   99-13649-286    Between Orf9 and Orf10   A/G            140066 
99-13648   A12   99-13648-259    Between Orf9 and Orf10   C/T            141176 
99-13647   A13   99-13647-278    After Orf10              C/T            144033 



Table 3 

BM    Marker Name     Position range of probes    Probes 
                      in SEQ ID No 1 

A1    99-13670-305    7498     7544               P1 
A2    99-13669-471    8169     8215               P2 
A3    99-13666-275    14460    14506              P3 
A4    99-13664-221    19602    19648              P4 
A5    99-13663-218    20560    20606              P5 
A6    99-13660-277    76924    76970              P6 
A7    99-13652-407    91065    91111              P7 
A8    99-13652-357    91115    91161              P8 
A9    99-13652-308    91164    91210              P9 
A10   99-13671-396    133975   134021             P10 
A11   99-13649-286    140043   140089             P11 
A12   99-13648-259    141153   141199             P12 
A13   99-13647-278    144010   144056             P13 

EXAMPLE 5 : VALIDATION OF THE POLYMORPHISMS THROUGH 
MICROSEQUENCING 

The biallelic markers identified in example 4 were further confirmed and their respective 
frequencies were determined through microsequencing. Microsequencing was carried out for each 
individual DNA sample described in Example 2. 

Amplification from genomic DNA of individuals was performed by PCR as described above 
for the detection of the biallelic markers with the same set of PCR primers (Table 1). 

The preferred primers used in microsequencing were about 19 nucleotides in length and 
hybridized just upstream of the considered polymorphic base. According to the invention, the 
primers used in microsequencing are detailed in Table 4. 

Table 4 

Marker Name     BM    Mis. 1   Position range of          Mis. 2   Complementary position 
                               microsequencing primer              range of microsequencing 
                               mis. 1 in SEQ ID No 1               primer mis. 2 in SEQ ID No 1 

99-13670-305    A1    D1       7502     7520              E1       7522     7540 
99-13669-471    A2    D2       8173     8191              E2       8193     8211 
99-13666-275    A3    D3       14464    14482             E3       14484    14502 
99-13664-221    A4    D4       19606    19624             E4       19626    19644 
99-13663-218    A5    D5       20564    20582             E5       20584    20602 
99-13660-277    A6    D6       76928    76946             E6       76948    76966 
99-13652-407    A7    D7       91069    91087             E7       91089    91107 
99-13652-357    A8    D8       91119    91137             E8       91139    91157 
99-13652-308    A9    D9       91168    91186             E9       91188    91206 
99-13671-396    A10   D10      133979   133997            E10      133999   134017 
99-13649-286    A11   D11      140047   140065            E11      140067   140085 
99-13648-259    A12   D12      141157   141175            E12      141177   141195 
99-13647-278    A13   D13      144014   144032            E13      144034   144052 



Mis. 1 and Mis. 2 respectively refer to microsequencing primers which hybridized with the 
non-coding strand of the olfactory receptor gene or with the coding strand of the olfactory receptor 
gene. 
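
The geometry of the probes and microsequencing primers relative to each marker can be checked directly from the tabulated positions. The sketch below uses the positions given for three of the thirteen markers (A1, A7 and A13); the dictionary layout and function names are hypothetical. It confirms, for these entries, that each probe is a 47-mer centered on the polymorphic base and that each 19-nucleotide microsequencing primer ends, or starts, immediately next to that base.

```python
# Positions (1-based, inclusive, in SEQ ID No 1) copied from the tables
# of Examples 4 and 5 for three of the thirteen markers.
MARKERS = {
    "A1": {"pos": 7521, "probe": (7498, 7544),
           "mis1": (7502, 7520), "mis2": (7522, 7540)},
    "A7": {"pos": 91088, "probe": (91065, 91111),
           "mis1": (91069, 91087), "mis2": (91089, 91107)},
    "A13": {"pos": 144033, "probe": (144010, 144056),
            "mis1": (144014, 144032), "mis2": (144034, 144052)},
}


def length(span) -> int:
    """Number of nucleotides in an inclusive position range."""
    start, end = span
    return end - start + 1


def describe(marker: dict) -> dict:
    """Summarize how the probe and primers sit relative to the marker."""
    probe_start, probe_end = marker["probe"]
    return {
        "probe_length": length(marker["probe"]),
        "marker_offset_from_probe_center": marker["pos"] - (probe_start + probe_end) // 2,
        "mis1_length": length(marker["mis1"]),
        # 1 means the primer's last base is the base just before the marker
        "mis1_gap": marker["pos"] - marker["mis1"][1],
        "mis2_length": length(marker["mis2"]),
        # 1 means the complementary range begins just after the marker
        "mis2_gap": marker["mis2"][0] - marker["pos"],
    }


if __name__ == "__main__":
    for name, marker in MARKERS.items():
        print(name, describe(marker))
```

Because each primer stops one base short of the polymorphic site, the single fluorescent ddNTP incorporated in the microsequencing reaction reports the identity of the polymorphic base itself.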

The microsequencing reaction was performed as follows : 

After purification of the amplification products, the microsequencing reaction mixture was 
prepared by adding, in a 20 µl final volume: 10 pmol microsequencing oligonucleotide, 1 U 
Thermosequenase (Amersham E79000G), 1.25 µl Thermosequenase buffer (260 mM Tris HCl pH 
9.5, 65 mM MgCl2), and the two appropriate fluorescent ddNTPs (Perkin Elmer, Dye Terminator Set 
401095) complementary to the nucleotides at the polymorphic site of each biallelic marker tested, 
following the manufacturer's recommendations. After 4 minutes at 94°C, 20 PCR cycles of 15 sec at 
55°C, 5 sec at 72°C, and 10 sec at 94°C were carried out in a Tetrad PTC-225 thermocycler (MJ 
Research). The unincorporated dye terminators were then removed by ethanol precipitation. Samples 
were finally resuspended in formamide-EDTA loading buffer and heated for 2 min at 95°C before 
being loaded on a polyacrylamide sequencing gel. The data were collected by an ABI PRISM 377 
DNA sequencer and processed using the GENESCAN software (Perkin Elmer). 

Following gel analysis, data were automatically processed with software that allows the 
determination of the alleles of biallelic markers present in each amplified fragment. 

The software evaluates such factors as whether the intensities of the signals resulting from 
the above microsequencing procedures are weak, normal, or saturated, or whether the signals are 
ambiguous. In addition, the software identifies significant peaks (according to shape and height 
criteria). Among the significant peaks, peaks corresponding to the targeted site are identified based 
on their position. When two significant peaks are detected for the same position, each sample is 
categorized as homozygous or heterozygous based on the height ratio. 
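
The height-ratio classification described above, and the allele frequency that follows from it, might be sketched as below. The text does not disclose the actual software or its thresholds, so the minimum peak height and the heterozygote ratio used here are hypothetical placeholders, as are the function names and return labels.

```python
def call_genotype(height_allele1: float, height_allele2: float,
                  min_height: float = 50.0, het_ratio: float = 0.3) -> str:
    """Classify one sample from the two allele-specific peak heights.

    min_height and het_ratio are illustrative values, not taken from the text.
    """
    high = max(height_allele1, height_allele2)
    low = min(height_allele1, height_allele2)
    if high < min_height:
        return "no call"  # signal too weak to interpret
    if low / high >= het_ratio:
        return "heterozygous"  # both peaks are significant
    if height_allele1 > height_allele2:
        return "homozygous allele 1"
    return "homozygous allele 2"


def allele1_frequency(calls) -> float:
    """Frequency of allele 1 among the called chromosomes."""
    hom1 = sum(c == "homozygous allele 1" for c in calls)
    het = sum(c == "heterozygous" for c in calls)
    called = sum(c != "no call" for c in calls)
    if called == 0:
        raise ValueError("no sample could be genotyped")
    return (2 * hom1 + het) / (2 * called)


if __name__ == "__main__":
    peaks = [(1000, 20), (500, 450), (30, 900), (10, 5)]
    calls = [call_genotype(a, b) for a, b in peaks]
    print(calls)
    print(allele1_frequency(calls))
```

Each called individual contributes two chromosomes to the count, which is why the denominator is twice the number of genotyped samples.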

While the preferred embodiment of the invention has been illustrated and described, it will 
be appreciated that various changes can be made therein by the one skilled in the art without 
departing from the spirit and scope of the invention. 

SEQUENCE LISTING FREE TEXT 

The following free text appears in the accompanying Sequence Listing : 

open reading frame 

ubiquitin 1 pseudogene complement 

ubiquitin 2 pseudogene complement 

polymorphic base 

or 

complement 
probe 

sequencing oligonucleotide PrimerPU 
sequencing oligonucleotide PrimerRP 

What is claimed: 

1. An isolated, purified, or recombinant polynucleotide comprising a contiguous span of at 
least 12 nucleotides of SEQ ID No 1 or the complements thereof, wherein said contiguous span 
comprises at least 1 of the following nucleotide positions of SEQ ID No 1: 1-113643, 
114064-127488, 127855-144460. 



2. An isolated, purified, or recombinant polynucleotide comprising a contiguous span of at 
least 12 nucleotides of a sequence selected from the group consisting of SEQ ID Nos 2-11 or the 
complements thereof. 

3. An isolated, purified, or recombinant polynucleotide consisting essentially of a 
contiguous span of 8 to 50 nucleotides of SEQ ID No 1 or the complement thereof, wherein said 
span includes an olfactory receptor-related biallelic marker in said sequence. 

4. A polynucleotide according to claim 3, wherein said olfactory receptor-related biallelic 
marker is selected from the group consisting of A1 to A13, and the complements thereof. 

5. A polynucleotide according to claims 3 or 4, wherein said contiguous span is 18 to 47 
nucleotides in length and said biallelic marker is within 4 nucleotides of the center of said 
polynucleotide. 

6. A polynucleotide according to claim 5, wherein said polynucleotide consists essentially 
of a sequence selected from the following sequences: P1 to P13, and the complementary sequences 
thereto. 

7. A polynucleotide according to any one of claims 1, 2 or 3, wherein the 3' end of said 
contiguous span is present at the 3' end of said polynucleotide. 

8. A polynucleotide according to claims 3 or 4, wherein the 3' end of said contiguous span is 
located at the 3' end of said polynucleotide and said biallelic marker is present at the 3' end of said 
polynucleotide. 



9. An isolated, purified, or recombinant polynucleotide consisting essentially of a contiguous 
span of 8 to 50 nucleotides of SEQ ID No 1 or the complement thereof, wherein the 3' end of said 
contiguous span is located at the 3' end of said polynucleotide, and wherein the 3' end of said 
polynucleotide is located within 20 nucleotides upstream of an olfactory receptor-related biallelic 
marker in said sequence. 

10. A polynucleotide according to claim 9, wherein the 3' end of said polynucleotide is 
located 1 nucleotide upstream of said olfactory receptor-related biallelic marker in said sequence. 

11. A polynucleotide according to claim 10, wherein said polynucleotide consists 
essentially of a sequence selected from the following sequences: D1 to D13, and E1 to E13. 

12. A polynucleotide according to claim 7 consisting essentially of a sequence selected from 
the following sequences: B1 to B11 and C1 to C11. 

13. An isolated, purified, or recombinant polynucleotide which encodes a polypeptide 
comprising a contiguous span of at least 6 amino acids of a sequence selected from the group 
consisting of SEQ ID Nos 12-21. 

14. A polynucleotide for use in a genotyping assay for determining the identity of the 
nucleotide at an olfactory receptor-related biallelic marker or the complement thereof. 

15. A polynucleotide according to claim 14, wherein the polynucleotide is used in an assay 
selected from the group consisting of: a hybridization assay, a sequencing assay, an enzyme-based 
mismatch detection assay, and an amplification of a segment of nucleotides comprising said biallelic 
marker. 

16. A polynucleotide according to any one of claims 1-15 attached to a solid support. 

17. An array of polynucleotides comprising at least one polynucleotide according to claim 16. 

18. An array according to claim 17, wherein said array is addressable. 

19. A polynucleotide according to any one of claims 1-15, further comprising a label. 



20. A recombinant vector comprising a polynucleotide according to any one of claims 1-15. 

21. A host cell comprising a recombinant vector according to claim 20. 

22. A non-human host animal or mammal comprising a recombinant vector according to 
claim 20. 

23. A mammalian host cell comprising an olfactory receptor gene disrupted by homologous 
recombination with a knock out vector, comprising a polynucleotide according to any one of claims 
1-15. 

24. A non-human host mammal comprising an olfactory receptor gene disrupted by 
homologous recombination with a knock out vector, comprising a polynucleotide according to any 
one of claims 1-15. 

25. An isolated, purified, or recombinant polypeptide comprising a contiguous span of at 
least 6 amino acids of a sequence selected from the group consisting of SEQ ID Nos 12-21. 

26. An isolated or purified antibody composition capable of selectively binding to an 
epitope-containing fragment of a polypeptide according to claim 25. 

27. A method of genotyping comprising determining the identity of a nucleotide at an 
olfactory receptor-related biallelic marker or the complement thereof in a biological sample. 

28. A method according to claim 27, wherein said biological sample is derived from a 
single subject. 

29. A method according to claim 28, wherein the identity of the nucleotides at said biallelic 
marker is determined for both copies of said biallelic marker present in said individual's genome. 

30. A method according to claim 27, wherein said biological sample is derived from 
multiple subjects. 

31. A method according to claim 27, further comprising amplifying a portion of said 
sequence comprising the biallelic marker prior to said determining step. 

32. A method according to claim 31, wherein said amplifying step is performed by PCR. 

33. A method according to claim 27, wherein said determining is performed by an assay 
selected from the group consisting of: a hybridization assay, a sequencing assay, a microsequencing 
assay, and an enzyme-based mismatch detection assay. 

34. A method according to claim 27 wherein said olfactory receptor-related biallelic marker 
is selected from the group consisting of A1 to A13 and the complements thereof. 

35. A method for the screening of a candidate substance interacting with an olfactory 
receptor polypeptide selected from the group consisting of SEQ ID Nos 12-21, or fragments or 
variants thereof, comprising the following steps: 

a) providing a polypeptide selected from the group consisting of the sequences of SEQ ID 
Nos 12-21, or a peptide fragment or a variant thereof; 

b) obtaining a candidate substance; 

c) bringing into contact said polypeptide with said candidate substance; and 

d) detecting the complexes formed between said polypeptide and said candidate substance. 

36. A method for the screening of ligand molecules interacting with an olfactory receptor 
polypeptide selected from the group consisting of SEQ ID Nos 12-21, wherein said method 
comprises: 

a) providing a recombinant eukaryotic host cell containing a nucleic acid encoding a 
polypeptide selected from the group consisting of the polypeptides comprising the amino acid 
sequences SEQ ID Nos 12-21; 

b) preparing membrane extracts of said recombinant eukaryotic host cell; 

c) bringing into contact the membrane extracts prepared at step b) with a selected ligand 
molecule; and 

d) detecting the production level of second messenger metabolites. 

37. A method for the screening of ligand molecules interacting with an olfactory receptor 
polypeptide selected from the group consisting of SEQ ID Nos 12-21, wherein said method 
comprises: 

a) providing an adenovirus containing a nucleic acid encoding a polypeptide selected from 
the group consisting of the polypeptides comprising the amino acid sequences SEQ ID Nos 12-21; 

b) infecting an olfactory epithelium with said adenovirus; 

c) bringing into contact the olfactory epithelium of step b) with a selected ligand molecule; and 

d) detecting the increase of the response to said ligand molecule. 

FIGURE 1 

[Figure 1, continued over a second sheet: multiple alignment of the amino acid sequences of orf-1 
to orf-10 with a consensus line, covering alignment positions 1 to 315. Transmembrane domains are 
marked above the alignment; the labels include TM1, TM6 and TM7.] 



WO 00/21985                                                      PCT/IB99/01729 

<110> Genset SA 

<120> Genes encoding olfactory receptors and biallelic markers thereof. 

<150> US 60/104,299 
<151> 1999-10-13 

<160> 27 

<170> Patent.pm 



<210> 1 
<211> 144460 
<212> DNA 
<213> Homo sapiens 

<220> 
<221> CDS 
<222> 2406..2600 
<223> open reading frame 1 

<220> 
<221> CDS 
<222> 9711..10658 
<223> open reading frame 2 

<220> 
<221> CDS 
<222> 24851..25369 
<223> open reading frame 3 

<220> 
<221> CDS 
<222> 45714..46661 
<223> open reading frame 4 

<220> 
<221> CDS 
<222> 80198..81115 
<223> open reading frame 5 

<220> 
<221> CDS 
<222> 96291..96902 
<223> open reading frame 6 

<220> 
<221> CDS 
<222> 110758..111564 
<223> open reading frame 7 

<220> 
<221> CDS 
<222> 122525..122887 
<223> open reading frame 8 

<220> 
<221> CDS 
<222> 132454..133389 
<223> open reading frame 9 

<220> 
<221> CDS 
<222> 143398..143577 
<223> open reading frame 10 

<220> 
<221> misc_feature 
<222> 113644..114063 
<223> ubiquitin 1 pseudogene complement 

<220> 
<221> misc_feature 
<222> 127489..127854 
<223> ubiquitin 2 pseudogene complement 



<220> 
<221> allele 
<222> 7521 
<223> 99-13670-305 : polymorphic base G or T 

<220> 
<221> allele 
<222> 8192 
<223> 99-13669-471 : polymorphic base G or T 

<220> 
<221> allele 
<222> 14483 
<223> 99-13666-275 : polymorphic base A or T 

<220> 
<221> allele 
<222> 19625 
<223> 99-13664-221 : polymorphic base C or T 

<220> 
<221> allele 
<222> 20583 
<223> 99-13663-218 : polymorphic base A or G 

<220> 
<221> allele 
<222> 76947 
<223> 99-13660-277 : polymorphic base A or C 

<220> 
<221> allele 
<222> 91088 
<223> 99-13652-407 : polymorphic base G or C 

<220> 
<221> allele 
<222> 91138 
<223> 99-13652-357 : polymorphic base A or G 

<220> 
<221> allele 
<222> 91187 
<223> 99-13652-308 : polymorphic base A or G 

<220> 
<221> allele 
<222> 133998 
<223> 99-13671-396 : polymorphic base A or G 

<220> 
<221> allele 
<222> 140066 
<223> 99-13649-286 : polymorphic base C or T 

<220> 
<221> allele 
<222> 141176 
<223> 99-13648-259 : polymorphic base A or G 

<220> 
<221> allele 
<222> 144033 
<223> 99-13647-278 : polymorphic base A or G 



<220> 
<221> primer_bind 
<222> 7362..7380 
<223> 99-13670.rp 

<220> 
<221> primer_bind 
<222> 7805..7824 
<223> 99-13670.pu complement 

<220> 
<221> primer_bind 
<222> 8120..8140 
<223> 99-13669.rp 

<220> 
<221> primer_bind 
<222> 8643..8662 
<223> 99-13669.pu complement 

<220> 
<221> primer_bind 
<222> 14308..14328 
<223> 99-13666.rp 

<220> 
<221> primer_bind 
<222> 14740..14757 
<223> 99-13666.pu complement 

<220> 
<221> primer_bind 
<222> 19346..19366 
<223> 99-13664.rp 

<220> 
<221> primer_bind 
<222> 19826..19845 
<223> 99-13664.pu complement 

<220> 
<221> primer_bind 
<222> 20298..20318 
<223> 99-13663.rp 

<220> 
<221> primer_bind 
<222> 20781..20800 
<223> 99-13663.pu complement 

<220> 
<221> primer_bind 
<222> 76752..76772 
<223> 99-13660.rp 

<220> 
<221> primer_bind 
<222> 77205..77223 
<223> 99-13660.pu complement 

<220> 
<221> primer_bind 
<222> 90967..90987 
<223> 99-13652.rp 

<220> 
<221> primer_bind 
<222> 91474..91494 
<223> 99-13652.pu complement 

<220> 
<221> primer_bind 
<222> 133925..133945 
<223> 99-13671.rp 

<220> 
<221> primer_bind 
<222> 134375..134393 
<223> 99-13671.pu complement 

<220> 
<221> primer_bind 
<222> 139807..139826 
<223> 99-13649.rp 

<220> 
<221> primer_bind 
<222> 140331..140351 
<223> 99-13649.pu complement 

<220> 
<221> primer_bind 
<222> 140912..140932 
<223> 99-13648.rp 

<220> 
<221> primer_bind 
<222> 141416..141434 
<223> 99-13648.pu complement 

<220> 
<221> primer_bind 
<222> 143828..143847 
<223> 99-13647.rp 

<220> 
<221> primer_bind 
<222> 144292..144309 
<223> 99-13647.pu complement 

<220> 
<221> misc_binding 
<222> 7498..7544 
<223> 99-13670-305.probe 

<220> 
<221> misc_binding 
<222> 8169..8215 
<223> 99-13669-471.probe 

<220> 
<221> misc_binding 
<222> 14460..14506 
<223> 99-13666-275.probe 

<220> 
<221> misc_binding 
<222> 19602..19648 
<223> 99-13664-221.probe 

<220> 
<221> misc_binding 
<222> 20560..20606 
<223> 99-13663-218.probe 

<220> 
<221> misc_binding 
<222> 76924..76970 
<223> 99-13660-277.probe 

<220> 
<221> misc_binding 
<222> 91065..91111 
<223> 99-13652-407.probe 

<220> 
<221> misc_binding 
<222> 91115..91161 
<223> 99-13652-357.probe 

<220> 
<221> misc_binding 
<222> 91164..91210 
<223> 99-13652-308.probe 

<220> 
<221> misc_binding 
<222> 133975..134021 
<223> 99-13671-396.probe 

<220> 
<221> misc_binding 
<222> 140043..140089 
<223> 99-13649-286.probe 

<220> 
<221> misc_binding 
<222> 141153..141199 
<223> 99-13648-259.probe 

<220> 
<221> misc_binding 
<222> 144010..144056 
<223> 99-13647-278.probe 

<220> 
<221> primer_bind 
<222> 7502..7520 
<223> 99-13670-305.mis 

<220> 
<221> primer_bind 
<222> 7522..7540 
<223> 99-13670-305.mis complement 

<220> 
<221> primer_bind 
<222> 8173..8191 
<223> 99-13669-471.mis 

<220> 
<221> primer_bind 
<222> 8193..8211 
<223> 99-13669-471.mis complement 

<220> 
<221> primer_bind 
<222> 14464..14482 
<223> 99-13666-275.mis 

<220> 
<221> primer_bind 
<222> 14484..14502 
<223> 99-13666-275.mis complement 

<220> 
<221> primer_bind 
<222> 19606..19624 
<223> 99-13664-221.mis 

<220> 
<221> primer_bind 
<222> 19626..19644 
<223> 99-13664-221.mis complement 

<220> 
<221> primer_bind 
<222> 20564..20582 
<223> 99-13663-218.mis 

<220> 
<221> primer_bind 
<222> 20584..20602 
<223> 99-13663-218.mis complement 

<220> 
<221> primer_bind 
<222> 76928..76946 
<223> 99-13660-277.mis 

<220> 
<221> primer_bind 
<222> 76948..76966 
<223> 99-13660-277.mis complement 

<220> 
<221> primer_bind 
<222> 91069..91087 
<223> 99-13652-407.mis 

<220> 
<221> primer_bind 
<222> 91089..91107 
<223> 99-13652-407.mis complement 

<220> 
<221> primer_bind 
<222> 91119..91137 
<223> 99-13652-357.mis 

<220> 
<221> primer_bind 
<222> 91139..91157 
<223> 99-13652-357.mis complement 

<220> 
<221> primer_bind 
<222> 91168..91186 
<223> 99-13652-308.mis 

<220> 
<221> primer_bind 
<222> 91188..91206 
<223> 99-13652-308.mis complement 

<220> 
<221> primer_bind 
<222> 133979..133997 
<223> 99-13671-396.mis 

<220> 
<221> primer_bind 
<222> 133999..134017 
<223> 99-13671-396.mis complement 

<220> 
<221> primer_bind 
<222> 140047..140065 
<223> 99-13649-286.mis 

<220> 
<221> primer_bind 
<222> 140067..140085 
<223> 99-13649-286.mis complement 

<220> 
<221> primer_bind 
<222> 141157..141175 
<223> 99-13648-259.mis 

<220> 
<221> primer_bind 
<222> 141177..141195 
<223> 99-13648-259.mis complement 

<220> 
<221> primer_bind 
<222> 144014..144032 
<223> 99-13647-278.mis 

<220> 
<221> primer_bind 
<222> 144034..144052 
<223> 99-13647-278.mis complement 

<400> 1 

caattaaagt tttgttcact ataagtcttt tttggaaaag agagagaaac attcaaatta 60 

tttacatacc agtttccatt agcatgtgaa gaacaaacag aaacactttt cagggtgaac 120 

aaaattcctg ctacagttat aaaatcctgc atatactctt tactttgtga ttctgaaaaa 180 

caccgttcta cctggtttat tgaaatgtgt gaaagctcta atgcaatgtt attttttaca 240 

ttttgtaaca cttaagtcat aaagccaagc tattctcaaa ccttgatgaa acatgttgga 300 

agaaattatg ttttagtgtt tggtgaaaac attatgtttc gtcacttaag gtgataaatt 360

gtactcatta aagaactttg aaagttcaca catagccaat ggtttaaaat gcactaattt 420

agattcccaa ttctcacaaa ggccagttac tctggaccat tcaatcgcca aagaggaaaa 480 

ctgggggcat tcccatcccg ggatatggga agtccccgag cttccagcct ggtccttgtg 540 

gccgaaaaat ggcatgtttt gcttttgctt ttggatcctg ttcgtgccgc caaaaatgtt 600 

ctgtgtgggg aaagtgcgag gggagagaaa agacacggac atgatacgtt taagggtaaa 660 

caacgtttat cccatgtaag tggccatgca gatatagtaa gcaaatgata taataataag 720 

caaatgatat aataagcaga ttgatataat aagtagattg caatggaacg gggaaaaggg 780 

aaaatacatc tacattcacc agactatgga ggattcaaca acagactggg acgcaacagc 840

ctgggctcca gagtcagata ggtaggcaaa gagatcctag ttctatacag atacgtacca 900 

tggagcagtt ccactttcct aagcacattc agttgtgata aaaatagatg agtttcaagg 960 

gctgatacat tacatgccac actcaaagtt gtgttgttaa acaatttcaa ttgttgttac 1020 

aatttcaaat aaaagcaatg tttacaacca tgggttcaag agaagtctaa gtgaacacat 1080 

ataataaaga cttgcaaaat aataaaagat aaggctcttt aactatcaaa agacttgcag 1140 

aaaagaacca cagaaaacca ttttaaatat aactgccttc gtatgtaaga aattctacat 1200

tatttttgat gttaaaacat caatctcatg cttactaggc tatttcttaa tgacacatgt 1260 

atttacaaat ttgagagaag aggaagaaat atcaggtgac accactgggt taatgcataa 1320 

atgacaaacc taaatgcatt ttaatttcct tttctttaaa tcgagctgag cttcagcccc 1380 

ttctttttgt ggtgttctta gtcatctacc ttatcacagt aatcacaatg taagcatgat 1440 

cttctttttt ttttttaagt gcacaatatt tttaactgtt aacaatatac ctattgttac 1500 

ctatgggcac aatgatatac agcatatctc tagaatttat tcttgcaaaa ctataacttt 1560 

atacctgctg aacagcaaca ccccatttct ccctttcctc cagccgctgc aaccaccttc 1620 

tattctctgt ttctatgagt ttgactattt tggattcctc atataaattt aatcatgcag 1680 

tatttgtcct tccgtgcctg gcttatttca cttaacataa tgtcctccag gttcatcata 1740 

tgacaggatt tcttcttttt cttaatgatg aataatattc cattacatgt gtgtactaca 1800 

ttttcttcat ctttcaatgg acatttaggt tgtttctata tctggactat tgtaaataat 1860 

ggtgcaatga acataagagt acctatgtct cttcaagagc ttgatttaaa ttcttttgga 1920 

tatatgccca gaagtgcaat tgctggttta tatgataatt cgatttttaa ttatttgaag 1980 

actcatcata ctgtttttta tagtggctgc acaattttat attcccacca atgttgtaca 2040 

agggttccaa tttcttcata tgtcaccaat atttgttgtc ttttggattt ttttaaaata 2100 

aagtaacagc catcataaca aatgtgatat catgcttttg tttcatatgc attttcctga 2160 

tgattagtgt gttgagcacc ttttcatttt tatttattta tttatttata ctctaagttc 2220 

tgggatacat ctgcagaaca cgcaggtttg ttacataggt atacatgtgc catggtggtt 2280

tgctgcgccc attaacctgt catctacatt aggtatttct gctaatacta tccctccccc 2340 

agcccccgac cccctgacag gtcctggtgt gtgatgtttc ccttcctgtg tccatgtgtg 2400 

tgagcatgag cttcttaata agaagtgatt caacactaca cactccaatg tgcttgttcc 2460

tcagtcatct ctcctttgta gatctctatt atgccaccaa tgccactcct ccgatgctgg 2520 

ttaacttttt ttttccaaga gaaaaaccgt ttcctttatt ggttgcttta tccaatttca 2580 

ccttttcatt gcactggtga tcacagatta tcatatgctc acagtgatgg tgtatgacca 2640 

ctacatggcc atctgcaagc ctttgttata tggaagcaaa atgtccaggt gtgtctgcct 2700 

ctgtctcact gctgctccct atatttatgg ctctgcaaat ggtctggtac aggtcatcct 2760 

gatgctttgt ctgttcttct gtgaacccaa tgagatcaac cacttttttt ttttttggag 2820 

aaaatgcatt atatgcacat ttaattccac tataaatttt tgaatggacg gttggagagg 2880 

aagggagaaa tacatattaa cggagagaat accacccaga aagtatatac aatgggagaa 2940

aggaacctgt tgatccaagt ttccatattc ttattatggc atataaggtc atgattattt 3000 

tctcagtatg aagcatctcc cagggctgac tctgatgtaa aattggagat caaccacttt 3060 

tattatgcag aaccacccct cttagtcctc gcctgcttgg atacttatgt caaagaaact 3120 

gccatgttca tggtggctgg ttccaacctc atctgccctc tcactatcat ctttatttcc 3180 

tacactttca tcttcacaga cattctgcat atctgcactg ctgagggaag gtacaatgcc 3240 

ttctccacct gcgggtccct tgtgactgcc gtcactgtct ttcaaggaac gctgtttcac 3300

atgtgcctga ggcccccttc tgaggcatct gtagaacagg ggaaaattgt agctgctttt 3360

tatatctttg tgagtcctac gttaaaccca ttgatctacc gtctgaggaa taaaaatgtt 3420

aaaagaacaa taagggaagt tatccaaaag aaactgtttg ctaagtaagg tagatatttt 3480 

agttgcaggt tatgtaatac attattttta tcttaccaat taacgagcat tataaattaa 3540 

caaatcactt tctgtcattg agtgtttttt gtcttttgta acttgcatat gggaattgaa 3600 

agtgtatacc aaattattag ctagagttga cagtgtcatc tcagtgaatt taagaagaaa 3660 

tcatagaaat ttaaatagaa gacttatggc atgtaaaagt caataaagaa cagtgattcc 3720 

ttctttagta ctcatattgg tagcaaacga taaaagacag aatgcaatgg aaattacagt 3780 

tcattacatt tttatagtac ttaataactt ccaaactatt ttctagacac ctttcaaaca 3840

tagtatatga agttttctcc tttcttttat acagataatg caacaataaa gatcactgat 3900 

gtagggaaaa gagagatcag actgttactg tgtctatgta gaaaaggaag gcataagaaa 3960

cttcattttg acttgtaccc tgaacaattg ttttgtcctg agatgctgtt aatctgtaac 4020 

tttgccccaa ccttgagctt ataaaaacat gtgttgtatg gaatcaaggt ttaagggatc 4080 

tagggctgtg caggatgtgc cttgttagca gaatgtatac aggcagtatg cttggtaaaa 4140 

gtcatcgcca ttctccagtc tccataaacc aggggcacaa tgcactgtgg aaagtcacag 4200

ggacctctgc cctggaaagc cgggtattgc caaggtttct ccccatgtga tagtctgaaa 4260 

tatggcctcg tgggatggga aagacctgac cggcccccag cctgacaccc gtgaagggtc 4320 

tgtgctcggg aggattagta aaagaggatg gcctcttata gctgagataa gaggaaggcc 4380 

tctgtctcct gcctgcccct gggaactgaa tgtctcggta taaaacctga ttgcaccttt 4440 

cttctattct gagataggag aaaaaccgcc ctgtggcggg aggcgagaca tgttggcagc 4500 

aatgctgctt tgttattctt tactccactg agatgtttgg gtggagagaa ggaaaaatct 4560 

ggcttacgtg cacatccagg catagtacct ccccttgaac ttatttgtga cacagattcc 4620 

tttgctcaca tgttttcttg ctgaccttct ccctattatc accctgttct cctaccacat 4680 

tcctcttgct gagatagtga aaatagtaat caataaaaac tgagggatct cagagaccgg 4740 

tgccgatgca ggtcttccat atgctgagcg ccagtcccct ggggccactg ttctttctct 4800 

atactttgtc tctgtgtctt atttcttttc tcagtctctc atcccacctg acgagatata 4860 

cccacaggtg cagaggggca ggccacccct tcaattgaag tatatctcag aatactactt 4920 

tgagatacag tcttagaatt atattttgag ccaatgaaat cttctttctt gaagcttttg 4980 

aagcaatgcc aaatttccgt tagtaggctt tataaatatc attgtttgca ttaccaggag 5040 

gcattcacat caatatgtga cctcacttct ccactctttc attgccattg aagcagatac 5100 

tttcaagtat gtcttaatat attgattttt atcttctcat tgggggaaca tgggaagtgt 5160 

cacatgtggg actacaccgt aatttgggta tttgtagtct taaggttttc atgaagcttc 5220 

gtgtgggcct ccatttctct agaacgattt gatgtgttcg ttttttatcc ttcacagcaa 5280 

cacatgctta ggcagatgaa tcactgcagc agcatttaga cacatttgtg attcagggat 5340 

agatagctct tcagtaggat ggtgtgaatt ttgggataat ggcacatact taaaacagaa 5400 

ctaccttttg acccagcaat cccattactg ggtatataca ccaaggaata taaatcattc 5460 

caccataaag acacatgcat gtgtatgttc atcacaacag tattcacaat agcaaagaca 5520

tggaatcaac ctaaatgccc ttcaacagta gattggatac aaaaaaatgt ggtatatata 5580 

catcatggaa tgctatgcag ccgtaaaaaa aaagaatgag attatgtctt ttgtagcaac 5640 

acggatgaag ctggaggaca atatccaaag caaaccaatg caggaacagg aaaccaaatg 5700 

ctgcaagttc tcacttacat gtggaagcta aacattgaat acacatggac acaaaaaagg 5760 

gaacaacagg caccaggacc tacttgaggg tgaagtgtgg gaggagggta aggattgaaa 5820

ctctgcctat caggcactgt gcttatcaac tgggtgatga aataacctgt acaccaaaac 5880

cctgtgacat ggaatttacc tttataacaa agctgcacat ggacccctga acctaaaata 5940 

aaagttaaaa aaagaaatct gtcccaagga gactgttttc tcttaatgtg ctgcatcctg 6000 

cttaatgaac tatggatttc atgcattctt tttcaagatt atattgccta cctgattgta 6060 

gacagtttga tgcattttac atagtatcag ttaaacatta aacataatta agagcatttg 6120

gcttcaaaat aatagtaaat gggtagaatt tattatggtt atagtactac tcatacaaat 6180 

aataatacaa catcagtgat gtagtgtcta gtgagcatga cactattata gaacacttct 6240 

taggctggat tttgataata atagcatgct ataacttttg aataaaaata gtaaattgaa 6300

taatcacaaa caagtaaaaa tctaaagggc cagtagtatg tattcaacta gcttaccagt 6360 

gtatcatttg tgtagctaaa tccattcgct gtccctctag cagacacaca tgctagttat 6420 

tgcagtggaa gaaaaatgaa atgaactaag gaattaatgt ctttgagtaa tataaacaga 6480 

gccttctgga ggttttctat gaaaaataac ataagtatgt gtaaaactct tctttgagta 6540 

agtatcatac acatagctaa attctgtatt ttctattcat tgctgtataa ataaattaac 6600 

attgccttct ttcgtgtgga catctgatca tgtgattcca tacttaaaat tacatatact 6660 

ttgctctttc taaatgaaaa tagcccattc atctatactt ttcatctaac cattcagtta 6720 

ttatagggct agttttattt ttgtcagaaa attcactttg taaatcttat gttttattat 6780 

gtagccattg tataccaaca tataaaagaa aatacataca tacatacaca catacattag 6840 

ggaatttttt cttcaaagtg aaagcagtct acttaaaaat tatccaaatc ttcaacattt 6900 

ttatgcaaaa caaggtacct ctacattaag ggaggaggaa atgagggaca ccagttttat 6960 

aatcttatga taccttatat tccatcttaa tttttttttt tgagatgaag tctcgctctg 7020 

tcacccaggc tgaggtgcag tggtgtaatt ttggctcatt gcaacctctg cctcctgggt 7080 

tcaagcaatt ctccctgcct cagcctcctg agtagctggg attacaggca cgcaccacca 7140 
aacctggcta agttttgtat ttttagtaga gacagggttt cgccatgttg gccaggctgg 7200
cctcaaactc ttgacctcag gtgatctagc ggccttggcc tcccaaagtg ctgggattat 7260 

aggcaagagc caccgtgccc ggcttccaaa atatttaagc aatattattg cattttacaa 7320 

ttttagtaat gcaaggaacc aaaataaaac agaataattg agatagagaa gactttacac 7380

atcaatttga aataaacata taggaatgcc ctatattctc aaatttcatg ggatgtaaca 7440

aatctacatt ttgattttga tatatagcta tattttatta aatgatttct cagagtataa 7500 

cccaccaccc gcaactctaa kaaaattagg gatgattctc cgtcttggtc agactgtact 7560 

ttgatccatt tgtgctaacc tggaactata tgtgcactgg aagatacaga ctaatgaacg 7620 

catctctagg tccctttgtc ctccaacaaa tacagtgcta cacaattttc aaatattctc 7680 

attctatttg caattccctt tctaaatcaa caattttatg catcatcaat tttataaata 7740 

gccctggttc tgagaccttt gatgattagc atgttaatat taaccttgat agtgcagact 7800 

agttccaaaa taaatccata acgcccgtcc tccaaaggat cctgggccct gaccaatact 7860 

tctgcctcct tctactgatt atgccactct tctgctctca cagtctttga aatctttcat 7920 

cttctcaaat gtctcacctg ccacaacttc ctacccctca ctcattagat gacctcactg 7980 

cagtgtttat aggcaaaagc tgaagagttc tgatgagaat tccttgtttt catcatacta 8040 

aaaaggcaga agttatctac ccatgttttc ctcttccatg ttattgaaag ggatggctaa 8100 

tttctctgtt tgtactaagg atctcatgca cacttgtacc gaaactctac ataaatgtaa 8160 

aataggaatt ttatcctcgc aggagagtct akctccagga agcatcaaat gaaggttatg 8220

ctcttgagga acaaagcata ttttaattgg ctctttagaa ttagtttaaa aatactctag 8280 

ggaagatacc taataaatat attgtaaatc ttgctacctt gttttcataa gatatggtcc 8340 

attaacatta acaggtaggt cattttctac atttgttcac taaaaataca atttgtgtgt 8400 

gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtagag ataagtattc catcaatggg 8460 

gaaaacaaat attttctcca atttcactag acttttctat ccatatgtta gtattgcatt 8520 

gactctctac aatcaattat atgtaagctc ttttgatgta tacatgagaa ctaaataaac 8580 

aatggcttta aggtggaatt tttggagtag aaatttttac aatcttatgt tactttcaaa 8640

gagaactgag agggagtggc ttgagttgga agctaagacc ccttttaaaa attatctgtg 8700 

catacctaaa caatattgtc atgagtgggt tagtggcaca aaaatgggaa caaaaaaaat 8760 

ccacactttc cccaaatagt agaattgcat atctgctgtt taaatgctgc attcctagaa 8820 

agatcttctt tgagtggtca tctaaaagca tcctacattc atgttttctt cacagaaaag 8880 

tggtttgcct gaaattgtat gtaacttttt ctgtgtgttt aatctctgtt ccctctatta 8940 

taatgtgtgc tctgctccct ctattataat gtgtgctccc tgagctggga gacttgttta 9000 

ctgtagtaac cgcaggaatt ggaacagaaa gaaatgaagc tatacagtat acatgagtaa 9060 

acaggcagtg acattacaaa gtggaaaaaa acaagtcttc attttgtacc ctcttagcca 9120 

tatatcagat agaaattaat ttctctagtt taatcgttcc tgaataaagg taaggcacac 9180 

aactatgggt cttaattgaa aatgctttgc ttttctttct tcattttgta tctgaaacaa 9240 

tacaatatca gagctggagg tataataaag atcagacttc ctttatttat ccatttgaaa 9300

gatgcaaata acctagggtt tttgtattta attttcattc ctttggattt tttgtttcct 9360 

cacgaagttt gaataaaatt accaaatgtg gagtacacca agaagacagg tataaatgta 9420

ggaatgaata aacttatgta tgtatacatg tatggcagag agaaatagag aatatgtatg 9480 

tttgtgtaag ttatgtgggt ttgatgtata gaaagataca gattaaaaca gacatatagg 9540

gagacaatgt tatgtaaaat ttccgatgtg attattgaaa caagagaagt aattgtcacc 9600 

tagataaata gatgaatgag cgaatgataa atggatgaaa caaatgccaa atctgaatca 9660 

gagagaaatc ctcacattct ttgtcacttt cagtttcaag agataagaag atgttctccc 9720

caaaccacac catagtgaca gaattcattc tcttgggact gacagacgac ccagtgctag 9780 

agaagatcct gtttggggta ttccttgcga tctacctaat cacactggca ggcaacctgt 9840 

gcatgatcct gctgatcagg accaattccc acctgcaaac acccatgtat ttcttccttg 9900 

gccacctctc ctttgtagac atttgctatt cttccaatgt tactccaaat atgctgcaca 9960 

atttcctctc agaacagaag accatctcct acgctggatg cttcacacag tgtcttctct 10020 

tcatcgccct agtgatcact gagttttact tccttgcttc aatggcattg gatcgctatg 10080 

tagccatttg cagcccttta cattacagtt ccaggatgtc caagaacatt tgcatctctc 10140 

tggtcactgt gccttacatg tatggcttcc ttaatgggct ctctcagaca ctgctgacct 10200 

ttcacttatc cttctgtggc tcccttgaaa tcaatcattt ctactgcgct gatcctcctc 10260 

ttatcatgct ggcctgctct gacacccgtg tcaaaaagat ggcaatgttt gtagttgcag 10320

gctttactct ctcaagctct ctcttcatca ttcttctgtc ctatcttttc atttttgcag 10380 

cgatcttcag gatccgttct gctgaaggca ggcacaaagc cttttctacg tgtgcttccc 10440 

acctgacaat agtcactttg ttttatggaa ccctcttctg catgtacgta aggcctccat 10500 

cagagaagtc tgtagaggag tccaaaataa ctgcagtctt ttatactttt ttgaccccaa 10560 

tgctgaaccc attgatctat agcctacgga acacagatgt aatccttgcc atgcaacaaa 10620

tgattagggg aaaatccttt cataaaattg cagtttaggc ttgtgtttat ttgcagtcac 10680 

gaattgcttg tggagtaaca aactggcttt tgaaatggaa aaacctagtg tagtcgtgat 10740

ttatttaaca tcatggactg tcagtaacca ctttactttc ttatccaaat gaaaaccttg 10800 

aagattgatt tcttagaaat aaaagcctta atgttgagaa atttaaaatg ttttatttgt 10860 

cagaaattct atgaaaataa attttttagt atctaataat tctatatgaa aatactatgt 10920 
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ttacttactt 
ttcatctcct 
aattttacac 
gctcagctac 
tgatacgtta 
ttcttttttg 
acaccttttt 
gcattccagt 
aatagggtct 
aataggggtg 
tttctgtggc 
agaaaatcaa 
accagatgac 
ggcattttaa 
atgtctgtat 
attaagcaga 
taagtctggt 
atttcggtga 
gttttggcaa 
gaaaagctgg 
tcatcggtgg 
attccatttc 
ggattttaga 
tttattttct 
ccttttagtc 
ttctatttgt 
ttctttggag 
gtcacagtgt 
agagaattat 
tgcccaaagt 
tctctttcta 
tggtggaatg 
ttatgtgata 
gttaacataa 
aattttgata 
cagtcgcaaa 
agtaaaattt 
catttcttct 
aatcacttat 
tatatagatt 
atgaccttaa 
atacataaga 
agcattcact 
atgtatctta 
atgttcccag 
aatttttcta 
aatatttgag 
ttatttgctt 
cagaataaaa 
ctcttcatgc 
aatgtgacct 
aagcacttat 
catttttgta 
tcctcccact 
ttataccata 
tatattttaa 
ggaaaaatgt 
agatatccag 
tgcatacttt 
ctgtaaatta 
ctgcttattg 
ggagaaaaca 
catttatttt 



ggggggtggg 
ttgaatatga 
tttttaaaaa 
ctttgtgctt 
aatttgctgt 
ttcatgtgat 
atgagtttcc 
tatgttttcc 
attaactgcc 
tatatatgtt 
ccattatttt 
tctcagagag 
tgcatattga 
tattttgcac 
gatatcaact 
tgacttgcag 
tctagccaaa 
caagaattta 
tgatggtggg 
gaattgagtg 
ggtatttctt 
tggcaagcag 
atgagtgcaa 
agttatgaga 
aaaatgactg 
gatttaattt 
ataggataga 
agtgctgatt 
tatctccatt 
catttggcca 
attcatactg 
ttggagatta 
gttatatgta 
tttataggca 
tacagaagac 
ggcatacata 
atgagggatc 
caataagttt 
catatacaaa 
aaatatttct 
tacaataata 
catagaagca 
tgtgttttat 
gtgtttttat 
aagtggaact 
tgggtaaatt 
aataatcatt 
ctttgatgtc 
attgaagcat 
tattgctgat 
tttgcttatt 
attcattgca 
ctgtgttgta 
atttattaag 
agtgaatcta 
actaaattcg 
tatcttcact 
tatttttaac 
atatatgcaa 
tttgaaaaac 
ttaagtttgt 
cttaaccttg 
tataccatac 



taaattttta 
tggtccacag 
atgaaattat 
attgtaatca 
aagtgttccc 
aatactggtt 
aagtgtatgg 
agtgccgtac 
ctgtgcatac 
tctggaggaa 
taaaataagg 
gctacctaag 
tacatatgta 
tctgatatta 
actatatcta 
tttttgtaaa 
tcctctttca 
acaattacat 
acaacagagg 
acacagagtc 
tattttatga 
aatggtctca 
aaaatactct 
attgctctgg 
acctgtttct 
tttccttctg 
acataaatcg 
ttaggaacac 
tacaacattg 
ctcagtagga 
tagttgttat 
tgaattgatg 
tcctaagaat 
ttataattaa 
atatatgcta 
ttcatattaa 
tgtcctgtga 
atatgcgtac 
tttgaacact 
gatgaacatc 
atacatatca 
acatttactt 
tccatttgat 
gactgatgta 
gctgaattaa 
tttctccaga 
tcactataac 
tcaagggaag 
atattgaaca 
aatttcttgc 
gctgttatgt 
atatttgata 
atattaaaaa 
caatgctttg 
tttctgccta 
tatgattagt 
tttctcttag 
ccataaatta 
atatatattt 
aawgaaaact 
tttccagaaa 
gtatattaat 
cggcgttcat 
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actaacttag 
agccataatg 
ttctacttca 
atgtgcaaaa 
caaacccatc 
atcgtccttg 
atccaattct 
ctctgactta 
ccataaatgt 
atactcagtg 
gtgttctagt 
gagacacatt 
tttagcccta 
actaacacca 
ctataaagtg 
atgggggcat 
gaaacagtat 
tttagcactg 
aaggtcatct 
ctcacttcct 
gaaatttcca 
tatccagatt 
atatttggcc 
cttgcagcaa 
aattgatttt 
catgtctttg 
acatgaataa 
tttattttaa 
gagcctgagg 
caaaagcagg 
taaagaaatt 
agattattgc 
agctaatgag 
aattgtctta 
tttaggctgc 
ttaattacca 
caaaggatac 
atgttactaa 
gattttcact 
atgttgaatg 
attgtagaaa 
ataatcacct 
tatcttgcaa 
tctgtgcatc 
atgttttcaa 
aagcttctag 
tctatcagta 
tcttagaaaa 
aatgaaggct 
atgaataatg 
aagttttcct 
gtgatatcag 
ttatcttttc 
tttttcaact 
tttcctcttt 
gtacacttaa 
gagactgcaa 
tgcatactat 
caatgtggct 
attttcctct 
ggcaggaaaa 
tctatcgatt 
aataaaacac 



cctgagagaa 
aggttcaaga 
atatggtgtc 
aatataaata 
ttgtaaatat 
attctgaaaa 
atcatcaaac 
ggactaattc 
cttgaaacag 
tttattttgc 
agttcttgaa 
aatccatgca 
ataaattgcc 
tacagaatat 
ttcaaatggt 
attcttgagt 
ctgaaaaaga 
tttgtttttg 
gcagagacaa 
ttttgagggc 
tctattaggc 
tgaaacccta 
taaataactg 
ccatttttaa 
taataaaata 
ggtaaaagtc 
aagtaattat 
tcttcataaa 
cacggagaga 
tttcttggtt 
ctcaaaacac 
tatagatgaa 
aatttccctt 
acgtccatgt 
attttattaa 
ggtacagaca 
attttcttta 
aatgcatcaa 
tgatataatg 
tcatataata 
actgactgaa 
acctgaagag 
accatgattc 
tctaattgtt 
tgcttttgaa 
caacttaaac 
gtaggagctc 
taattctgtt 
acacataagg 
tttgtatttc 
catcctaaaa 
gaattatttt 
cctcttattt 
gatttgtgat 
atgtttggcc 
aatatgtttt 
atcatctgat 
tgttcaattt 
atacatgaaa 
ctatgggggc 
cgttggtggc 
cattaacaac 
tttttttcaa 



attctgaaaa 
tagggtggat 
ttgaacaata 
tatgggtcac 
cttgagaagc 
tcaactctca 
tactttaact 
agtaggagtc 
tatacaaata 
ttcatgaagc 
gctgctatag 
ttattagtca 
tcatatgttt 
gctacagaaa 
tttccagaat 
ctcatttgat 
aggaaggcaa 
tatttgcttg 
caatgaagca 
cagcatcatt 
aaagcattgt 
ttfegaaaatt 
tacgattcca 
actttgacat 
ttttatttat 
acctcagaaa 
aagagttgca 
tgaaataatg 
ttaaatttct 
tcaaattctc 
tgcaaatgct 
atagtggtta 
gaaaattccg 
gaagggagaa 
tatattttcc 
aatatgaggg 
tattttgtta 
gcaaatgtga 
tatgcacctt 
gttagatcat 
ccatattgac 
agctgaactc 
catcatatca 
ccttttgtgt 
actcctgcta 
ctgtactaga 
ttgtttatgg 
ctaataagca 
ttgataattt 
tatttggaaa 
ttgaaaaaac 
agatataata 
ttaaccaatt 
gtcaccttta 
acaatataca 
aatttctgat 
gattgctttt 
agaaactata 
tacttaaaca 
cataggaaaa 
ctgtgagtaa 
tttatacaaa 
caaatatctc 



10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
1314 0 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
taaatacaga ttcttttctc taactatgct gactttttgc caacaactcc acccaaaaga 14760
gttgacatgc ccacatatat tcttctgcta ggaatcattt caaacactga aaaaacccta 14820
ttaattagaa ttaaaataaa tattaaacat ttaaataatc agagaaaaat tactctgtgc 14880
ttcctaagaa tgggaatact tcacagtaat actgttgctt taatttgaca ataaaactat 14940
ttaaaaataa gacccaccct acagtctggc aatgtatttg tcacaaataa gacaaagtgg 15000
cacaattact gaaatatgtt atgttaatat aaatcatttg aagcactaaa acatgaccat 15060
ataaataaat gttgaatact atacgtagaa aattgattaa aatacaaaat gctaaaatta 15120
acgcagaaaa tagtggcagc taaaaacatg aagtacaata ctgtttatgt gtctatattt 15180
aagacagctg aaaaagtcta gcacgttcat acattggttg attaatggta taaaattttt 15240
ctagaaataa atacaatatt attaatattt taaggagtca aatgtttcta taattttatt 15300
cagctattcc aaatttaaga atctactgaa ggcataatta gaaacttgtg tgaagagttt 15360
catggtgttc attttataat caagaaaaag caaacaaatg atgttcgaaa tacaagtagc 15420
tcccacttag caccgtacca tgtaacagaa acctttctat atggcaaacg tgtcttgttt 15480
tgcatggttc ttacttagat attacaaaat ctcaaagtga ggactgcctg ctttggataa 15540
ctggtaaagc acaggcatct accagatggg gtagaaggtg tttctaaata actagaatgt 15600
catcagcctt gatcatgaag gaaagtgtaa agtacagttt aatttttttt tcttagaatg 15660
ttatacagct attaaagtga ggatgttcct agttaagaaa aaggtaaatc attttgtgta 15720
atatattatg tgcaaagaca tatccaatgc tacatatgga tatcttgtaa aatatcaggg 15780
ataaagagaa acaaaactta ttaatagcta tgtccaaatt ttccatgata agcaattgtt 15840
gctattacat tgagaaaata agctaacaaa aactttgtat gcccagttta attcagatct 15900
caggatatat aattacagga tgcttattta taatagtaga atttaaaaat aaaagtagga 15960
aaaagcaagg agttaaaaac ttgttgtaga gcgtgatggc agaagaggag gtgctcagtt 16020
ctggtttcca gaacagaatg accagttagc aacttctcac agataagaat gcctaggtaa 16080
aaattccata acccaagggt gaggtggggc acactcttgg agcacagaaa ctgggaaaag 16140
tcatattaga agtgtaaggg gagcagtttc actttgactg tgttgcttct ttcccgacca 16200
aacagtgtca cactgaaagg gatttcttgg gcctgcgttc tctagtggag aaaagagagc 16260
ccacggcaga catccaactt ccctgtgttc cagggcactt cccaaggggc ctagttctgt 16320
ctaatcttgt ggggaataat ggaggaattg gcagggcttg accccttatg gtcagttcat 16380
aactacctcc ttctacaacc cattctgtat tccctttacc ttcagcaagc accttagcag 16440
gacacggttt tttacctggt ggagtgacac aaatcttcat tcctgatggg tctgggccat 16500
ttgtagtttt gcctggattg ggttgttgta gtttctcact gaccttaatc acaggacatg 16560
gtaatactgt gacgtcctat gggatctcct gtattccaca aatactcttc cttaacctcc 16620
attgtggagt agtagtttga tttcatcttg atagtctggg tcagtcaccc cagccaacac 16680
tgtacctccc ttcttagcct gttgacttag aggcaggaag agctcaaagt ggccaggtgg 16740
caaacttatc ttccagtttg atggaatcat ctttctgtct tctggtggca aagttcctcc 16800
ctccggaacg aagacctcta ggcccacaga acataatgtc acagaaacag gaaggaaaat 16860
tttgctagtg ggtcagtaga ggtgatggta agtggcacca cttccacttc caccccttga 16920
ttcctgcatt cttggatcct ggctatggaa gaaacagtac cacatattgg atgcctactc 16980
agagcataca aagacttctg gagaaatttg ccccaggcct gcattttttt tttttttttt 17040
tgtaaaatga agtcctgctg tgtcttccag aggcctatag tgcagtgtca cgatcttggg 17100
tcactgcaac ctctgcctcc caggttcatg cgattctctt gtctcagcct cccgaatagc 17160
tgggattaca ggcatgtgcc accacggctg gctaactttt gtatttttag tatggatgag 17220
ggttcaccat gttggctagg ctggtctcga actcctgacc tcaagtgatc cacccgcttc 17280
agcctcccaa agtgctggga ttacaggcat gagccaccgt gcccagcccc tgtaataaag 17340
tatcacctag ttgttgtaac tgttacttca aaaggctttt ccacctttct atcaatccat 17400
ctgctttagg ataatgggaa acatggtaag acaagtatat cccatgagca tgagcccact 17460
gtcttattct ttagctgtaa aatgagtgcc ttggtcagag gcaatggtct gtggaatacc 17520
atgacagtgg ataaggcatt ctgggagtcc acgaatggta atcttggcag aagcattgca 17580
tgagagatag gtaaacctat atccagagga agtgtctaat gtgactaatc cactctagaa 17640
ttctaatcac ccaaagcttt tggattcctc cctctacatt aaaccaggga agatctggca 17700
tttcaaggtc gctcacagta ggccatcctt tgatccatat ttcagctaat aaactattag 17760
aacatttttt aactccctga tttgtaacat tgagtgcaga atctctgctt agtgggccca 17820
tatcaataca ttcagcctga tccaacttta tattcctccc accattatcc cacaccctta 17880
atatccattc ctacacctgt tctccagaat tctgcttata taaactagaa aactcaagca 17940
gctcttttag ggtgtagcac acctccttgt agacttttgg cttacaccaa ctagcaccag 18000
atcttggaag gagaaaaggg attaaagtct gaatgtattt cctaaagaag aagaccaacc 18060
aaaataggtt atttaaatat ttggctattc taagtccaaa gctactcttt aaaaggaagc 18120
catatttaac cactccatgt aatatgtaat tatgtcttga catacagaac taggtaatgc 18180
aataattaaa ggtatttggt caccataatt tttttctaaa aacggagaaa tgtgcaaatt 18240
atttacctac cagtttacac aaacaggcaa agcaaaaaca aaagcacttg tcaaggtgga 18300
caaaaatgct gctacagtta aaaatctgca tatactcttt actttgtcat tcttaaaaat 18360
gtgtttctgg gtagttggtt tactgaaata ggtgaaagct gtactgtaat gtcattttta 18420
aattttgtaa cactttttgt aacattttgt aacatttggt gaagaaatta tgtttcttca 18480
cttaaggaga taaattgtgt tcattataga actttgaaag tgaacgacca aaccaaatgt 18540
gcaaaaattg actgaatatg aattacttta cacttatgtg tgaaaataat cttcaacatt 18600
atgtattcca tacacatgtt tttggtgttt taaattatct ccgggctttc ctaaattcaa 18660
gttggagaag cagaattagt tagatttata gcacactcac tatagaaaac atattttgaa 18720
aacttgtttt accacatctc tgttcagctg gatggggctc ctgtgaaaga acaaagcaat 18780
tacatagaat tttacacaga attgtaaaac ctgcaaattt tttttaacat tttctttacc 18840
ttatctgata gcacagcaca tacgttgcaa ctcttactct gaaaactact atcacttctg 18900
aatgtttatt aaataacagc cataggtgtg cgaaggattt acaaatggat agccacattt 18960
actgccaatg aaaaacaaag tttatttaac atgtgcagta tctttgtgat atatatatta 19020
ctccttattt ggtacagatg aaaacgctga agctcagaag aatgaaataa tgttttccag 19080
gaacacagca agagctaggt ttcaattaaa atattattct cgtagctact tctgggttga 19140
ctctagaaac accccaagcc acatgggttt gttaagatgt ttaactgggt aattattttt 19200
atgggtaatt atgtttcaga cagttttaca attttacaga tataaaaata tatatgaaaa 19260
agatatatat atgttgtaat gcatatatat atattcatat ttcatatgaa gaaggcacaa 19320
gaggaacact cagtgcctag gtaaaaagaa tggagttgga tagtaggtta atagtggttg 19380
atgagataag aagtagtcat ggtcatgact tactttcctg gtgtcatgca atttctattg 19440
acagcatttc ttcaaagttg taatttttat tgagaataaa ttaataaaac tctgtaagat 19500
tttaagataa aagatctccc aatattttca aagcatcatt acaaccttct ttaaagtttc 19560
tagcaattgt atcaatttct ttattcttat tttcattgta gaaaaaaatt gatttttgag 19620
gtatyttaaa agtcatagag aaaatatttg gccatggaaa gggtccaaaa tgggaggagg 19680
aaattaaaaa ggactgagat ttctgaaaca catacattgt tccaagggag taccaagcat 19740
gctacaaagg aggcagagga ttgaatgact gaagaaaaga ttttgaagga cactttgagg 19800
tgttcttcct acaccttcca tcatcctatc tcaggagggt gaatggattg gagagattat 19860
aaaaagagtt attgtgatgg ccttgatgtt agtccaaaac atccttttct agtctatgtg 19920
attgataagg atgctgaagc tgttcatggc tgtggaaagg gaaggggaga ggagagttat 19980
cccattgctg tactgtatct gagtgacttc acctcagagg aaaggaacag agaagcatct 20040
ctaaaccaca ttctcaaatc caaattcacc agcaataatg ctagtaattt tcatatctga 20100
ctcccatgtg tagactctaa ttgagtcagg atccaaaagg ^99gcaattc aatggctcct 20160
ttaagttcta acatgatgaa tgtagaagta ttccaaattt ttagctaaaa taatatttat 20220
cataatagtt attattactg ctactaactc actgcaagac ttatggaata actactatat 20280
gatgctctgg ccaggatctg agatgtccac aattctactc atatgtttta gctaaagaga 20340
atttaatgaa aagacagaga ataaaaaggt aaagttaaag gaatcagcca aaggggttaa 20400
aaaagtcaag gctaactgta gaaaaccatt tccatcttta ggtctgaagg accaaggaca 20460
gtggcttctg tcacagagct cagtgagagc tgaagccagg aagaaatccc ttgccaacat 20520
cttcccaaga cacacatccc aagacattct ccagggaggg atgcagccaa tttagcacac 20580
ttrtaatgaa gcaggaaggg aatgagaaag aaattgaggt aattccctat ctttcccact 20640
tctgacatcc tagagagtct tctcaaaagc caaggtcagc tggaaaccag acaccatagg 20700
tgccctgttg accaggtcca taggagccca actccaggtg aacaatgcaa ggcacagaag 20760
agaagacaat agatgtgaat gggagtaggg tggaaaagga gaatcaccag cacacctata 20820
ttatctataa atccaaaatg ttccacactt acacattact atccctatta aaaagataaa 20880
tgtacaaaac tcaaaattga tatatcacta gttcaagttc aacaaaagtt agtttatctt 20940
ccaatggtgt tccagactct gaaaccttct cccgatacca tccactatac ttgctgtaag 21000
caggaactta atatttccca aagtctaaca tattaaatta tttgggagta aagtaaaata 21060
aataatgaat cccattccta catcactctt gggggatgcg attccctgaa gacagtgtgg 21120
agttggcaaa actcatgtgc tccgtgagct taggattttc agaagaatcg aaatattgtg 21180
gttgttcata atttgaatga ctttttgcta gaaataattg acctgcaggg atttggtctc 21240
cacacagtaa ttgcattata aactctacag gttctcaata cccattctgt aatcatgaaa 21300
tcatgcttct ttctcaggta ggattagata tattttttaa aaagtttaaa atccggcagt 21360
aaatatttta attgatagaa ataaatggag ctgaaacata ttctgatgaa ttagaaaatg 21420
tttatattaa agtgttgcta ttcttttcag agtactggtg tttacagagt gtttgcataa 21480
tgaaaagaaa ataggccaaa agaattttta tttcatcaat aattattgac tgcctccaat 21540
tcaggaggca caaatctagg tgctgctgag tctataacta gatcagataa aaatccttga 21600
ccctaaagga gctcatgctt tgatgttgag tttcgttcta gactagacct ggatcattca 21660
ccaaataata ctagtagcct caatttgtta agcatgtagt acatgacaga tatattttaa 21720
aaacttttac tgaattaact ttctcatcaa gagagaatgg ctattaatca aattttatgg 21780
ataagaaaac tgaggcacac agaaattaat aacttgcccc aggccacaca ggtaggaact 21840
agaggcactg ggtttttcac ccaggaagtg tgactgcaaa ctcccagtaa ctgtcccact 21900
ctgctggctt tcaatagtaa actatgctta ctgtttatac ttgatttaga acaatcaaga 21960
tctggtatta aatcatcctt ctgattcatt ttgtgttaac aatgccactc tttactatta 22020
taagagctgt atcatgtaat attttatagc cctttcataa agaatgggct acttttaggt 22080
ttatagccta tattacttaa aaataataaa aatatgtttt gtgaagctgc cttctctaga 22140
tttctttttc cactccaaat tgaaccatga ccactttcca aggagacacc actcttgtta 22200
aagacaaaga ccaaccttct agactctttg taggacagaa tagaagaagc aagcattcac 22260
atgggaatgt ctcagtgtag ggtgcatgtc aggagagcac aggctcagat gcctccaaag 22320
ccatttctgt ccaattgcag aagcttctcc ctgagatttc cttgctcaaa gcagtaggat 22380
attgtgttct tgttttttca gaagaaagac tatttccaca aaattctcag ttaaattatt 22440
tggactattt attgcttgca ttacttcaac ttccaataga aaaaaaaaaa gactcataaa 22500
aaaacagatg aattgagaag ttattttgaa gggtgattaa tatttcaata aaaaagccca 22560
ttaaaaatgt aattaggcgg ggcacggtgg ctcccgcctg ttatcccagt actttgggag 22620
gctgaagcgg gtggattgcc tgaggacagg agttcaagac cagcttggcc aacatagtga 22680
aaccccgttt ctactaaaaa tacaaaaaaa ttagctgaac gtggtggtgg gcacctgtaa 22740
tcccagctac tcgggaggtt gaggcagcag tattgcttga acctgggaga cagaggttgc 22800
agtgagctga gattgcacca ttgcactcca gcctgggcga caagagtgaa actccgccac 22860
acacacacac acaaatgaaa tccagcacca aaacagtatt tgcctatttg gcaattaaat 22920
gtaccatcaa acagagggat agagaatgtg gtatattata tatatatcat acattataat 22980
gtgatataat atataataca ccacagtttc tttatccact tgttgtttga tggatatata 23040
catatatatt atatgtaata tatatacaaa atgtatatga tggaatacta ctcagccata 23100
aaaagggatg aagtaatggc attcacagca acttggatgg gattggagac taatattcta 23160
agtgaagtaa ttcagaaata aaataccaaa tatagtgtgt tctcactcat aagtggaaac 23220
taagctgtca ggatgcaaag gcataagaat gacacaaagg acttcgggga cttgggggaa 23280
acggtggagg aggtgaggga taaaaggcta cacattgggt tcattgtata ctgtttggat 23340
gatgggtgca ccaaaatctc acaaatcacc actaaagaac ttactcatgt aaccaaatac 23400
caccttttcc cccaaaacct atggaaataa aaaataaaaa taaaaaccaa aagaaattca 23460
attggaccct gtgagcttta acaaggtaat tatgattcta ttgatttagg tgacttatct 23520
tctaacttat taccctaggc agaagcccaa tgtccttttg tgtagaatga gataatacgg 23580
tggacaagtt ttgaacattg aatttacaga gtgctttatt tatgaagtga cctgtttcca 23640
gtgttggcat ggtagttatt aaaaaaaagt tagtataaca ttgtcaagtc tggatatgtg 23700
tcatagaaga aacacgtcga ttgttccatc atgatcttca tggtcacttt tatcttgccc 23760
aaatgttgag agattcaggt caggatattt taagtatggt ctcagtcatg ttgttaatat 23820
gattatgatt tcactttgat ctgttacatg ttttaacaaa ttcatattga acttgtgttc 23880
caatgttttt tcttagtttt tcaagtgtta gaatgaggta gagttaaaac aacagcattt 23940
tagtttgagg ataagtttct tcatcatttt atccttattg cccccctccc caggagccct 24000
ttgatacact attgttttct ttatatccct atcaacattt aataccatca atacaggtgc 24060
acttgtgtat ctatatacca tgcaaatgta tatggtttat gtattatgtg catatgtatt 24120
ataaatacaa atatgtaata tctactttca ttccccaaga accaattttt gcccccttgg 24180
acaaaaatat acaaaatgaa gaaatgtagc ttggtcttta aagttaaaaa agatttgttg 24240
ataagagggc cagctgaggt agaataaagt gaagaaaatg ggcacacatt agagggtgag 24300
ggaagatctg gaaaatgatc tccagttcac aggggcaggc aaagcgattc ttgttctaca 24360
gggatgtata ccaggcatca gtcccacttt cctaagcacg ttcagttgtg ataaacctgg 24420
agggtttcaa aggctgatac tttagatccc acattcaaag gtgtgttgtt aaacaaagaa 24480
ttacagtttc aaagaaaagc aatgtttaca accatgggtt caagaaaagt ctaagtgaac 24540
acatataaca aagacttgca aaaagataaa agataaggct ctttaactat caaaagactt 24600
gcagaaaaga accacagaaa accattttaa atattattgc ctttgtatat taaaaaactc 24660
tatattagtt tagatgttta aagcatcaat cacatgctca ctaggctatt tcttaatgtc 24720
acatgtattt acattttgag agaagaggaa gaaatagcag atgacaccac tggggtaatg 24780
cataaatgac aaacctaaat gcattttaat ttccttttat ttagatgtca tttgaagcca 24840
agcaaacaca atgttaaaga aaaaccatac agccgtgact gagtttgttc tcctgggact 24900
gacagatcgg gctgagctgc agtcccttct ttttgtggta tttctagtca tctaccttat 24960
cacagtaatc ggcaatgtga gcatgatctt gttaatcaga agtgactcga cactacacac 25020
tccaatgtac ttcttcctca gtcacctctc ctttgtagat ctctgttata ccaccaatgt 25080
tactcctcag atgctggtta actttttatc caagagaaaa accatttcct tcatcggctg 25140
ctttatccaa tttcactttt tcattgcact ggtgattaca gattattata tgctcacagt 25200
gatggcttat gaccgctaca tggccatctg caagcccttg ttatatggaa gcaaaatgac 25260
caggtgtgtc tgcctctgtc tggctgctgc tccctatatt tatggctttg caaatggtct 25320
aagcacagac caccctgatg cttcgtctgt ccttctgtgg acccaatgac atcaaccact 25380
tttactgtgc ggacccaccc ctcttagtcc tcgcctgctc agatacttat gtcaaagaga 25440
ccgccatgtt ggtggtggct ggttccaacc tcatttgctc tctcaccgtc atcctcattt 25500
cctacacttt catcttcact gccattctgc gtatccacac tgctgagggg aggcgcaagg 25560
ccttctccac ctgcgggtct catgtgaccg ctgtcactgt cttctatggg acactgttct 25620
gcatgtacct gaggccccct tctgagacat ctatacaaca ggggaaaatt gtagctgttt 25680
tttatatctt tgtgagtccg atgttaaacc cattgatcta cagcctgagg aataaagacg 25740
ttaaaagaag tataaggaaa gttattcaaa agaaactgtt tgctaagtaa ggtagatatt 25800
ttggtcatag gcgttggaat ctgttcttat tatctgacca attaatgaac atttaaaatt 25860
aacaaatcaa tctgtcattg agtgtttttt gtcctttgta atttgcatat gggacttaaa 25920
agtgtatgtc aaattattag ctagagctta cactgccacc tcagtaaatt gaaaatgaaa 25980
gcatagaaat tcaaatataa gactagaaga cattgttcta gctctgtaaa aagtaatgaa 26040
gtacagatga 
tactaatata 
ctaatatatc 
aatatatcat 
tatatcatag 
catagtacta 
tagtactata 
tagtactata 
gacagctcat 
ctacctaact 
agttgtaatc 
gagtgactac 
agaaacttgc 
aatgaagatc 
tcatagatcc 
agaaagagtt 
tcaattcttc 
atgttttatc 
gtgtgtattt 
tggtgagaat 
ctctgttgct 
cgggttcacg 
accacgcccg 
gatggtcttg 
tacaggcatg 
gttttcttat 
ggtgtagcaa 
tccaaagtta 
tgtctttttt 
atatattatt 
aatatacatc 
ttaaatcttc 
gatacatctc 
aaatttaatg 
atagaaatat 
tgtatctttc 
tcatgcagca 
gtgtatttat 
gctggggtgc 
ttctcctgtc 
ttttgtattt 
gaccttgtga 
gagttacatt 
attatgaatg 
aagatataag 
tatcaacagt 
tcagacttta 
ggttgaattg 
atttctcttt 
tttatttcat 
tgttatatgc 
catgtaatcc 
aggccaaact 
gttgtggtgg 
tgagcctggg 
tgacagagga 
aacaaaacaa 
gcctcagcaa 
tttggaattt 
gatttttttt 
gggattggat 
gttattaccc 
aggcctgtaa 



ttccattatt 
tcatagtact 
atagtactat 
agtactatat 
tactatatca 
atatatcata 
tatcatagta 
tatcatacta 
tacatttttt 
tctatgtatt 
ttctatgctt 
atttgctaat 
caaccatagt 
ctttaggtgt 
aatgaaaccc 
tttaattgtt 
actcttttat 
cacgaatgca 
tgattacaat 
caaggtcgta 
caggctggag 
ccattctcct 
gctaattttt 
atctcctgac 
agccaccgcg 
ttatttcaga 
ttgtatatat 
atattttatc 
tacttttagg 
aaattatttt 
taaaatagaa 
acagaatgcc 
tcctgactga 
tttcttatcc 
gtagtactga 
tgcaatttat 
gcaattttta 
tctacgattt 
aatggcacga 
tagcctccta 
ttaatagaga 
tcctcccgcc 
ttccagtttc 
tgtaccacgt 
tatctttaaa 
ttgtgagtgt 
aatgttttct 
ttattacaag 
cacttatatg 
acatttataa 
catttaacct 
cagtactttg 
acacaagata 
gatgcccttg 
aggtagaggc 
tggatgacag 
accccaaaaa 
atagccatag 
cagctcattt 
ttccacatag 
aaattatatg 
aaatagaaaa 
tcccagcctt 



agtactcata 
atatcatagt 
atcatagtac 
catagtacta 
tagtactgta 
gtactgtatc 
ctatatatca 
aaatatcata 
taaatgtttt 
cagaatctca 
atcttcgtga 
tacattttat 
agatgaggtt 
actttggatt 
tatttcataa 
tgtatgttgt 
taccattgcc 
tgtggatatt 
ataacatttt 
tgcctttttt 
tacagtggcg 
gcctcagcct 
tgtattttta 
ctcgtgatcc 
cccagctgct 
tatttctcaa 
atacacatga 
attttaaatt 
aagttaagtt 
catggtttcg 
aataatataa 
ccatttgttt 
ttctgtctct 
ttctgtagag 
tttgctacat 
ttatcttgca 
gtggtatagt 
tttttttttt 
tctcggctca 
gtagctggga 
cgggatttca 
tcagcctccc 
ttgataatat 
aaggagttcc 
tttactatat 
tacctttaca 
aatctgatgg 
tgaaactaag 
tccatatact 
tagttcctta 
taaaaaccta 
ggaggctgag 
gagactctgt 
tggtcccagc 
ttcagtgaac 
agcctggatg 
acctagggtg 
gtatctattt 
gacaagtatt 
tgaatgctta 
aagtgtgatc 
ttaacattta 
ttgggaggcc 
gtactaatat 
actaatatat 
taatatatca 
atatatcata 
tcatagtact 
atagtactaa 
tagtactata 
aaagacagaa 
tgtttgtccc 
catgaggcag 
gtctgtttga 
aatacttaat 
tctgtttttc 
attactttaa 
ggcttttcat 
accagtaagg 
tgtttccttt 
tgtgtgtttt 
ctttttttta 
tttttttttt 
cgatctcggc 
cccgagtagc 
gtagagacgg 
gcccgcctcg 
tcacctgttt 
tgatataaac 
attccatttt 
ttctcaatat 
tatgccacct 
cttgttaatt 
agactccttt 
cagttatttc 
tgtcccattt 
ttacatcatt 
ttaaaacttg 
ctaaatattg 
ctatgccaat 
ttgagacaga 
ccgcaaactc 
ttacaggtgc 
ccatgttagc 
aaattgtatt 
gaaaatccat 
tctcaggtag 
attgccaatt 
tcactttatt 
gtaaacaatg 
tatttttaag 
ttgttaactt 
tgtaccaaat 
gggtaataat 
gcaggaggat 
ctttacaaaa 
tacatgggaa 
catgttgtca 
actccagcct 
agatgaacag 
ttctggtagc 
ttcataatag 
tatgtgagca 
ctagtatatt 
aaagagccaa 
aaggcccaca 



atcatagtac 
catagtacta 
tagtactata 
gtactatatc 
aatatatcat 
tatatcatag 
tatcatagta 
ttatcaattg 
tttgaatgaa 
cacatttagc 
aaagttggat 
aagcctccaa 
ttttatacat 
gttatatcct 
ggaaatgcca 
attcccatca 
gaataaataa 
atgtgtgtat 
attgcttatt 
ttttttttga 
tcactgaaag 
tgggactaca 
ggtttcaccg 
gcctcccaaa 
ttaaattgtg 
atgtttttgt 
atctcttcta 
ttttttattt 
atgatcatgt 
ttttcttatg 
gtaagcacaa 
agaaatttaa 
cttgctgtaa 
ggataggtgt 
aaataaatgc 
ttggatgtat 
atataaatat 
gtctcactct 
cgccccccag 
ccaccaccat 
caggatggtc 
ttatttctac 
taataaatgt 
gttggtaatt 
gttttactga 
cttgccaaaa 
aattctcatt 
tatattgatg 
ttcaggtagg 
gctacataca 
gccaggcagt 
tgcttgagcc 
aaataaaaaa 
gctgaggcag 
ccacgacact 
ggatgaccct 
atgaacgtta 
tgcaatttga 
atattttccc 
taaaatgttt 
gctttgccat 
caggctgggc 
gatcacttga 



tatatcatag 
tatcatagta 
tcatagtact 
atagtactaa 
agtactgtat 
tactgtatca 
ctatatatca 
ctgagaaaat 
tggaggatat 
acacttgttt 
aaatggcaat 
gctattttcc 
attatataga 
tagaatgata 
aatttccttt 
acatgtgaac 
attgaattga 
gtaggtgtgt 
caaatttttt 
cagagtattg 
ctccgcctcc 
ggcgcccgcc 
tgttagccag 
gtactgggat 
gtgtttatat 
tatatataca 
ttgtgtcttt 
cactttattg 
ggatttttac 
aactattttt 
ctaatcacta 
aacttacaaa 
actctattac 
ttatatattc 
atccactcta 
ctgtggtgaa 
gcctcaatat 
tgttgcccag 
ggtcaagcga 
gccaggctaa 
ttgatcacct 
tgttggtgaa 
tttgtgtgta 
tcttggtcat 
tttatatttc 
ctaggtgttg 
tggtgtttaa 
ctaatctcat 
gtgatttgtc 
aatcttttgg 
gtggctcaca 
caggagtttt 
aaattagcca 
gaggatcact 
ccagtctgga 
gtctcaataa 
tttcttgaat 
ttgggacaag 
agtaggattg 
cccacagatt 
tgaaatactg 
aaagtggctt 
gttcaggagt 



26100 
26160 
26220 
26280 
26340 
26400 
26460 
26520 
26580 
26640 
26700 
26760 
26820 
26880 
26940 
27000 
27060 
27120 
27180 
27240 
27300 
27360 
27420 
27480 
27540 
27600 
27660 
27720 
27780 
27840 
27900 
27960 
28020 
28080 
28140 
28200 
28260 
28320 
28380 
28440 
28500 
28560 
28620 
28680 
28740 
28800 
28860 
28920 
28980 
29040 
29100 
29160 
29220 
29280 
29340 
29400 
29460 
29520 
29580 
29640 
29700 
29760 
29820 
tcgagactaa 
taggtctggt 
cttcagcctg 
ggtaacagag 
tgcctgtaat 
agaccagcct 
gtgcagtggt 
gaacccgggg 
gacagagcca 
tataataaga 
gaaattcaac 
attaaaaagg 
gagagaactt 
aaacagttgc 
agtcacacag 
tgtgagaaat 
ccagtctaca 
gaggtaccgg 
cgcaccctgc 
cagggagttc 
cactcccacc 
atatcccgca 
gcagtctgag 
ccaggcatgc 
gctcaaggag 
aaaagacagc 
gttctcccag 
ctgacccctg 
acctcacacg 
aaactaacaa 
agaccaaaag 
tctaaaaagc 
caaagctgga 
tactccaagc 
aatttagaag 
gagctgaaaa 
gatcaactgg 
gggaagttta 
tatgtgaaaa 
accaagttgg 
caggccaaca 
gcaactccaa 
agggcagcca 
gatctctcag 
aaagaaaaga 
ggagaaataa 
accctaaaag 
ctgcaaaatc 
cgagcaaaat 
actttaaaca 
aagagtcaag 
cataggctca 
gaaggggttg 
gacaaagaag 
ctaaatacat 
ctacaaagag 
acattagaca 
ctgcaccaag 
acattttttt 
gctcttctca 
gcaatcaaac 
ctgaacaacc 
atgttctttg 



cctgggcaac 
ggcaagtgtc 
ggaggcaaag 
taagatcctg 
cccagcactt 
ggccaagatg 
gggtgcctgt 
ggcagaggtt 
gactccaact 
atgccaacag 
tccaatcttt 
catttttttt 
catatttcaa 
acatacccat 
aaaattcata 
gaaacctatt 
gctcccagcg 
gttcatctca 
gcgagccgaa 
cctttcctag 
ccaatactgt 
gctggctcag 
atcaaactgc 
ttaggtaaac 
gcctgcctgc 
agtaacctct 
cacgcagctg 
acccccgagc 
gccgggtact 
acagaaagga 
cagataaaac 
agagcacctc 
tggagaacga 
tacgggaggt 
aatgtataac 
ccaaggctcg 
aagaaagggt 
gagaaaaaag 
gaccaaatct 
aaaacactct 
tccagattca 
gacacataat 
gagagaaagg 
cagaaactct 
attttcaacc 
aatactttac 
agctcctgaa 
atgccaaaat 
aaccagttaa 
caaatggact 
acccatcagt 
aaataaaagg 
caatcctagt 
gccattacat 
atgcacccaa 
acttagactc 
gatcaacgag 
catacctaat 
cagcaccaca 
gcaaatgtaa 
tagaactcag 
tactcctgaa 
aaaccaacga 



atgaggaaac 
tgtagtccaa 
gttgcagtga 
tcttaaaaaa 
tgggaggccg 
gcgaaacccc 
aattccagct 
gccatgagcc 
caaaaataaa 
aataattctg 
taattttatt 
cttaaccaat 
ggcacagccc 
tcctgtgatt 
ataatacaat 
gtatcggtta 
tgagcgacgc 
ctagggagtg 
gcagggtgag 
tcaaagaaag 
gcttttccaa 
agagtcctat 
aaggcggcag 
aaagcagcca 
ctctgtaggc 
gcggacttaa 
gagatctgag 
agcataactg 
ccaacagacc 
catccacacc 
cacaaagata 
tcctcctcca 
ctttgacgag 
cattcaaacc 
tagaataaca 
agaactacgt 
atcagcaatg 
aataaaaaga 
atgtctgatt 
gcaggatatt 
ggaaatacag 
tgtcagatac 
tcgggttacc 
acaagccaga 
cagaatttca 
agacaaccaa 
ggaagcacta 
gtaaagacca 
catcataatt 
aaatgctcca 
gtactgtatt 
atggaggaag 
ctctgataaa 
aatggtaaag 
tacaggagca 
ccacacatta 
acagaaagtc 
agacatctac 
ccacacctgc 
aagaacagaa 
gattaagaat 
tgactactgg 
gaacaaagac 
cctgtctcta 
gctactcagg 
gctgagatca 
ataataaaat 
aggcaggtga 
gtctctacta 
acttgggagg 
aagatcgtgc 
taaataaata 
atttcagcta 
ctcctttatt 
agttttatcc 
aaatatcaac 
gtgtagatat 
gccatgtttc 
agtactgaag 
agaagatggg 
gcagacagtg 
gcattgcctc 
gggtgacaga 
cgggcttaaa 
gcccacggag 
cgaggctggg 
ggaagctcga 
tccacctctg 
atgtccctga 
aaagggcaga 
ggaggcaccc 
tgcagctaag 
gaaaacccat 
gggaaaaaac 
aaggaaagca 
atgagagaag 
aaaggcaaag 
aatagagaga 
gaagaatgaa 
gaagatgaat 
aatgagcaaa 
ggtgtacctg 
atccaagaga 
agcacaccac 
accaaagatg 
ttcaaaggga 
agagagtggg 
tatccagcca 
atgctgagag 
aacatggaaa 
tcaagactag 
acaggatgaa 
attaaaagac 
caggaaaccc 
atctaccaag 
acagacttta 
ggatcaattc 
cccagattca 
ataatgggag 
aacaaggata 
agaactctcc 
tccaaaattg 
attataacaa 
ctcactcaaa 
gtacataacg 
acaacatacc 



aaaaaaccac 
aggctgatgt 
tgccagtgca 
aggccaggtg 
atcacaaggt 
aaaatacaaa 
ctgaggcagg 
cactgtactc 
aataaaataa 
cagtttatgc 
ctgttgttca 
ttattgctgt 
tcccttgtcc 
tgcaactcta 
ctaatatatt 
atggccgaat 
tgatttctgc 
ggccgaggtc 
actctggaag 
tggcacctgg 
aaatggcaca 
tctcactgat 
ggaggggtgc 
actgggtgga 
ggggcaggac 
cagctttgaa 
cttcctcctc 
cccagcatgg 
ggtcctgtct 
ctgtacatca 
agagcagaaa 
gttcctcacc 
aaggcttcag 
aagttgaaaa 
agtgcttaaa 
gaagcctcag 
tgaatgaaat 
gcctccaaga 
aaagtgacag 
acttccccaa 
aaagatactc 
aaatgaagga 
agcccatcag 
ggccaatatt 
aactaagctt 
attttgtaac 
ggaacaaccg 
gaagaaactg 
attcacacat 
atagactggc 
atctcacatg 
caaatggaaa 
aaacaacaaa 
aacaagaaga 
taaagcaagt 
actttaacac 
cccaggaatt 
accccaaatc 
accacatact 
actatctctc 
accactcaac 
aaatgaaggc 
agaatctctg 



aaaaattagt 
gggaggatgg 
ctccagcctg 
cagtagttca 
caggagttcg 
aattagccgg 
agaatcgctt 
tagccttggt 
taaaatataa 
ttttttatga 
gacattgcat 
ccttattctg 
tttcatgagc 
tttgtctgtt 
ctgagagttc 
aggaacagct 
atttccatct 
agtgggtgcg 
cacaaggggt 
aaaatcgggt 
ccaggagatt 
tgctagcaca 
ccaccattgc 
gcccaccaca 
acagacaaac 
gagagcagtg 
aagtgggtcc 
gcagactgac 
gttagaagga 
ccatcatcaa 
aactggaaac 
agcaacggaa 
atgatcaaat 
ctttgaaaaa 
ggagctgatg 
gagccgatgc 
gaagcaagaa 
aatatgggac 
ggagaatgga 
tctagcaagg 
ctcgagaaga 
aaaaatgtta 
actaacagcg 
caacattatt 
cataagtgaa 
caccaggcct 
gtaccagccg 
catcaactaa 
aacaatatta 
aaattggata 
cagagacaca 
acaaaaaaag 
gatcaaaaga 
gctaactatc 
cctgagtgac 
cccactgtca 
gaactcagct 
aacagaatat 
tggaagtaaa 
agaccacagt 
tacatggaaa 
agaaataaag 
ggacacattc 



29880 
29940 
30000 
30060 
30120 
30180 
30240 
30300 
30360 
30420 
30480 
30540 
30600 
30660 
30720 
30780 
30840 
30900 
30960 
31020 
31080 
31140 
31200 
31260 
31320 
31380 
31440 
31500 
31560 
31620 
31680 
31740 
31800 
31860 
31920 
31980 
32040 
32100 
32160 
32220 
32280 
32340 
32400 
32460 
32520 
32580 
32640 
32700 
32760 
32820 
32880 
32940 
33000 
33060 
33120 
33180 
33240 
33300 
33360 
33420 
33480 
33540 
33600 
aaagcagtgt 
tccaaaattg 
tcaaaagcta 
acacaaaaaa 
aaaattgata 
gcaataaaaa 
agaggatact 
ttcctcaaca 
caataacagg 
gaccagatgg 
ttctgaaact 
ccagcatcat 
caatatcctt 
agcagcacat 
gctggttcaa 
aaaaccacat 
tcatgctaaa 
ctatctatga 
ctttgaaaac 
tggaagttct 
aagaggaagt 
tcatctcagc 
aaatcaatgt 
aaatcatgag 
aacttacaag 
taaaagagga 
tcttgaaaat 
taccaagggc 
aaagagcccg 
tacctgactt 
aaaacagaga 
acaactatct 
ttaataaatg 
tccttacacc 
aaaccataaa 
aggactttat 
atctaattaa 
aacctacaaa 
gaatctacaa 
gtgaaggaca 
aaaaaatgct 
catctcacac 
ggatgtggag 
ttgtggaagt 
gccatcccat 
tgcacacgta 
tgtccaacaa 
gcagccataa 
atcattctca 
gatgggaatt 
cggttgtggg 
gacgagttag 
acattgtaca 
aaacagaaac 
aattgagaaa 
tgacatttgg 
aaaagttctt 
aaggtgtcct 
tctctttctc 
tttggtttag 
ctttcaggac 
gcctgaaatg 
gtatgttcat 



gtagagggaa 
acaccctaac 
gcagaaggca 
cccttcaaaa 
aaccgctagc 
atgataaagg 
acaaacacct 
catacactct 
atctgaaatt 
attcacagcc 
attccaatca 
cccgatacca 
gatgaacatt 
caaaaagctt 
tatatgcaaa 
gattatctca 
aactctcaat 
caaacccaca 
tggcacaagg 
ggccagggca 
caaattgtcc 
cccaaatctc 
gcaaaaatca 
tgaaatccca 
ggatgtgaag 
tacaaacaaa 
gtccatactg 
tttcttcaca 
catcgccaag 
caaactatac 
tatagatgaa 
gatctttgac 
gtgcagggaa 
ttatacaaaa 
aaccctagaa 
gtctaaaaca 
actaaagagc 
atgggagaaa 
tgaactcaaa 
tgaacagaca 
catcatcact 
cagctagaat 
aaataggaac 
cagtgtggcg 
tactgggtat 
tgtttattgc 
tgatagactg 
aaaatgatga 
gtaaactatc 
gaacaatgag 
gtggggggag 
tgggtgcagg 
catgtaccct 
attatatcta 
gaataattac 
agctgaaaat 
ttatatatgg 
tgttctttgg 
tctgttgccc 
cttatattgt 
aatctcagta 
gctcaagtgc 
gtatggtgct 



atttatagca 
atcacaatta 
agaaataact 
aattaatgaa 
aagactaata 
ggatatcacc 
ctatgcaaat 
cccaaactaa 
gtggcaataa 
gaattccacc 
atagaaaaag 
aagcctggca 
gatgcaaaaa 
atccaccatg 
tcaataaatg 
atagatgcag 
aaattaggta 
gccaatatca 
cagggatgcc 
attaggcagg 
ctgtttgcag 
cttaagctga 
caagcattct 
ttcacagttg 
gacctcttca 
tggaagaaca 
cccaaggtaa 
gaattggaaa 
tcaatcctaa 
tacaaggcta 
tggaacagaa 
aaacctgagg 
aactggctag 
atcaattcaa 
gaaaacctag 
ccaaaagcaa 
ttctgcacag 
attttcacaa 
caaatttaca 
cttctcaaaa 
ggccatcaga 
ggcaatcatt 
acttttacac 
attcctcagg 
atacccaaag 
ggcattattc 
gattaagaaa 
gttcatgtcc 
gcaagaacaa 
aacacatgga 
gggggaggga 
gcaccagcaa 
aaaacttaaa 
tctccttgtt 
aattgaaggc 
atttaatagc 
ttgggtgatc 
acatattttc 
tctctgtatc 
gtcttacaga 
cagacaaggc 
acagcaggtg 
ggtggaggtg 
ctgaatgccc 
aaagaactag 
gaaatcagag 
tccaggagct 
aagaaaaaaa 
accgatccca 
agactagaaa 
accaggaaga 
tcaatagctt 
agaggtacaa 
agggaatcct 
gagacacaac 
tcctcagtaa 
atcaagtgga 
taatccagca 
aaaaggcctt 
ttgatgggac 
tactgaatgg 
ctctctcacc 
agaaggaaat 
atgacatgat 
taagcaactt 
tatacaccaa 
cttcaaagag 
aggagaacta 
ttccatgctc 
tttacagatt 
aaactacttt 
gccaaaagaa 
cagtaaccaa 
cagagccctt 
aaaacaagca 
ccatatgtag 
gatggattaa 
gcattaccat 
tggcaacaaa 
caaaagaaac 
cctactcatc 
agaaaaacaa 
gaagacattt 
gaaatgcaaa 
aaaaagtcag 
tgttggtggg 
gatctagaac 
gactataaat 
acaatagcaa 
atgtggcaca 
tttgtaggga 
aaaaccaaac 
cacaggaagg 
ttgcattggg 
ggcacatgta 
gtataataat 
ttcatagcaa 
acatagtgat 
tacaatcttt 
tgttggataa 
agtgttgtga 
actgcctttc 
ctcttcacat 
tgctgcttgt 
tatgtgtgtt 
caatcataca 



acaagagaaa 
aaaagcaaga 
cagaactgaa 
ggttttttga 
gagagaagaa 
cagaaataca 
atctagaaga 
agttgaatct 
accaacgaaa 
ggaggaactg 
ccctatctca 
caaaaaagag 
aatactggca 
cttcatccct 
tataaacaga 
tgacaaaatt 
gtatttcaaa 
gcaaaaactg 
actcctattc 
aaagggtatt 
tgtatatcta 
cagcaaagtc 
caacagacaa 
aataaaatac 
caaaccactg 
atgggtagga 
caatgccatc 
aaagttcata 
caaagctgga 
aacagcatgg 
agaaataacg 
atggggaaag 
aaagctgaaa 
agacttaaat 
tcaggacata 
agccaaaatt 
taccatcaga 
tgacaaaggg 
acaaccctgt 
atgcagccaa 
tcaaaagcac 
gaaactacag 
actgtaaact 
tagaaaaacc 
catgctgcta 
agacttggaa 
tatacaccat 
tatggatgaa 
actgcatatc 
ggaacatcac 
agatatacct 
tacatatgta 
agtaataata 
gtcatcaata 
gaaagaaaaa 
aaaagtcagt 
tgttttattc 
gttgtctttc 
tctcaattcc 
gttggcctca 
ttttggaaat 
tgttggtgtg 
acgagaataa 



gcaggaaaga 
gcaaacacat 
ggaaatggag 
aaggaticaac 
tcaaatagag 
aactaccatc 
aatggataaa 
ctgaataggc 
aagagttcag 
gtaccattcc 
ttttatgagg 
aattttagac 
aaccgaatcc 
gggatgcaag 
accaaagaca 
caacaacact 
ataataagag 
gaagcattcc 
aatatagtgt 
caattaggaa 
gaaaacccca 
tcagaataca 
acagagagcc 
ctaggaatcc 
ctcaaggaaa 
agaattaata 
cccatcaagc 
tggaaccgaa 
ggcatcacac 
tactggtagc 
ccgcatatct 
gattccctat 
ctggatccct 
gttagaccta 
ggcatgggca 
tacaaatggg 
gtgaacaggc 
ctaatatcca 
caaaaagtgg 
aagacacatg 
aatgagatac 
gtgctggaga 
agttcaacca 
atttgaccca 
taaagacaca 
ccaacggaaa 
ggaatactat 
attggaaatc 
ctcactcata 
actctgggga 
aatgctagat 
actaacctgc 
ataaattaaa 
catgtttcta 
ttgttgattc 
atttgaaatg 
ttcttaaagc 
cttgtctctg 
tatttcataa 
gccttacata 
gactcctctt 
tatgtattgt 
tttttgctct 



33660 
33720 
33780 
33840 
33900 
33960 
34020 
34080 
34140 
34200 
34260 
34320 
34380 
34440 
34500 
34560 
34620 
34680 
34740 
34800 
34860 
34920 
34980 
35040 
35100 
35160 
35220 
35280 
35340 
35400 
35460 
35520 
35580 
35640 
35700 
35760 
35820 
35880 
35940 
36000 
36060 
36120 
36180 
36240 
36300 
36360 
36420 
36480 
36540 
36600 
36660 
36720 
36780 
36840 
36900 
36960 
37020 
37080 
37140 
37200 
37260 
37320 
37380 
acaatagcta 
ctgttcatta 
gatagtttat 
agatacatta 
gcactgccta 
ttttcgcact 
gactgatgca 
ttgcttgctg 
gtagtatatg 
gaggatcaga 
aaccttaaaa 
catgtgtttt 
catgcttatt 
ttgaaaagtc 
aaggccaata 
atctaagttt 
cttccatgat 
ctctcaactt 
ttagctttcg 
taactttgtg 
atctagcaag 
atttatgggg 
gtttgacaaa 
tgggaagtgt 
tcaggaattt 
agtatccaga 
gacagttttg 
agatagatag 
agagtagtaa 
atgtttgaga 
atttctgtgt 
ccagtttttc 
gtttcagttg 
ccattgaagt 
taaggacaga 
tgaactctaa 
aggaagaggc 
gaaaggagtg 
ttgggagcta 
tgttggcagc 
gatatggcct 
ggttgaggcc 
aatctttgtt 
tgttgaattt 
gctcacaact 
ttgtgggaga 
tagtcttcct 
gttttaaaat 
attcacaatt 
agctattctt 
gaatttttat 
aaatatttta 
acctttgttc 
tctgtacact 
cttagtaact 
ttttcaagaa 
gttaacctga 
ctaattaatt 
atggacattt 
tgtgttttta 
tgctgaaatt 
ttgactatat 
ctttaatttc 



ctaccattta 
acaaacttct 
cctgttcaga 
gatgatctca 
ttataatgca 
ggtttgctat 
ttgtcatttt 
actgtctatt 
atactcgcac 
acaaaatcag 
atgtgactgt 
ctcttttctg 
cagtgtacct 
actggtcaca 
accatttgcc 
ctcttcttct 
actggggaca 
gaattatttg 
atgcttaagc 
gctctgtttt 
taatttgatt 
taaattcact 
tgtgtacagc 
gagaataggg 
tttcttgaat 
aacagttgac 
ggaaggtcat 
atgatagatt 
cccattatag 
ctctgtatgg 
catagtttta 
acagcccatt 
cttcacatta 
gagtgtgaag 
ttttattcag 
ttttgttttc 
atgagcaggg 
ttgatccatg 
gtagacaaga 
cttgagtttt 
tgagctgtta 
tagtcaagaa 
atgcggtgtt 
cttttcatgt 
tagactttct 
tatttatatg 
agtctatggt 
ttgatgacac 
agtgctttta 
ccattttttt 
ttccaccaat 
atgaacatat 
aaaaatccat 
taacctcttc 
cttaggtgga 
agccttcaga 
ttttacttca 
agttatgtaa 
aagtagtgca 
tgacagcatt 
tgatagaagt 
tgaatattct 
gctctgcaga 



ttttactttt 
agttaggtaa 
tgtgagaaag 
catgcacaat 
tagaattctg 
gctaatattg 
ccccataact 
tgttctaaca 
ataattgttt 
atcataagac 
tttccttctc 
tttctgttat 
gcttttcaag 
tcagcccaca 
ttgaccacac 
tctctccttc 
aaaaaagaga 
ttttaaaaaa 
ttttgacttg 
ctatttggtt 
tgcttaattc 
tacattaaat 
catgttatca 
caagtaaaga 
aaatactcac 
ttttgacttt 
tactatgtca 
catcaagatt 
ggatatacta 
atatgtcttc 
tgatgaacta 
gaattatttt 
tcattgttaa 
tgttacaaag 
taatatacta 
acagagataa 
ctcaagagag 
tgaaacccac 
gctctttctc 
ctcaggcagg 
gccactgtgt 
agtgctcaga 
tcataatggg 
gatcgtttgc 
tttttttttt 
ttcttaatac 
ttgccttttc 
acacacactt 
ttttcttcct 
ttttttcctg 
ttagttttga 
ccatattttc 
tggctacata 
atcatcctta 
tgatattaac 
aacccctgtt 
attacaatga 
atactacaat 
aattatgcaa 
tattttagaa 
ggctttaaaa 
aatgcacaag 
cgtttgtgct 
tgcagtaaaa 
gaaggtgtta 
aatgttaaaa 
tcctatgtcc 
ttggtggaaa 
ttatatgttg 
tgccatatta 
ttatttttta 
ggactaaatt 
atgacagaac 
ttttttgagg 
actgcctgac 
gaaaatgtaa 
tggatactct 
ccaattccta 
ttcttatttt 
agataagaga 
gtatagggga 
ttgtaatcct 
ttgtttttaa 
tgaggaaaga 
gtaggaactt 
caattccaat 
tgccacaaag 
ggaattgttg 
tttttagtgt 
atcagaaagt 
gtatgaacag 
taattggttc 
ttttgccttg 
taaatttata 
cattattata 
tctttgtatt 
ataaacaaag 
ttgcaatgga 
ctgtgtattc 
tcagagaagt 
ccagttttgc 
cacatattgg 
ttcttccagg 
ttgtgttttg 
ggattctttc 
cataatttac 
tgttcatgta 
aattgaactg 
caatcatttg 
attatcttag 
tatatatata 
aagaaatctt 
gaggacttgt 
gtattggcat 
cagtagcagt 
tacgagtgga 
gatccacaca 
ttaagggaca 
caacttgtgt 
tcaattgatt 
ggtcagataa 
caatatttgt 
tcagatcatg 
ttattgatag 
tgagacgtat 
tttcagtgtg 



tttttttctt 
ttaactaaga 
ctagtgtcta 
attttgcaaa 
aatacaagtg 
aaagtgcatg 
gatcactaca 
aaacaatcag 
agcagtttga 
acatatagct 
cttcttagag 
agaaacatca 
tttgtgtgtt 
ggattcttat 
cttaaaactc 
ttttttttct 
tgaactagtg 
aaatgggttt 
gggtatggaa 
tgattctctg 
aaaaatttcc 
taagtgtgca 
tttcaaggct 
ttcactgtcc 
caagcttttg 
tctcatggtt 
tgattcttag 
tttgttcatt 
atctatgcat 
gataaatact 
taaaatttta 
tcttgttatg 
gtcaatattt 
ccagctacca 
caggtaggtt 
taaaggtaga 
aaaaaaatta 
caactggctc 
ctgaaacaga 
agtactacag 
ttcaagtctt 
tagagtttgg 
attttcttga 
tcttctttta 
tttatatttg 
tcagaattac 
cattattttg 
taattatata 
tgctttcctc 
agatttagca 
gtggtgtgtt 
ctccattaaa 
tctatttgtg 
ctgctttgaa 
gtttttcaca 
ttgaagacat 
gatgtctgct 
ttggacacat 
tcactttcca 
aagttttgca 
atttaggaag 
gtctttatta 
tatcttttgc 



acccactgct 
gtttgttaat 
ctctattagg 
catatttact 
tgtccagaca 
atttgagatt 
ttaatttatt 
aaacttctta 
ccatccttat 
ttgagactaa 
ttgaaagctt 
ctttacttac 
ggaaaggaat 
ctccatagag 
tttatctaac 
tttaaccagg 
gctttctgga 
gaggaacacc 
agaactctga 
gatgttaaac 
tttgagttaa 
gttcagtaaa 
actgtgcagc 
ttaccaagat 
attactttct 
ttcatggagg 
atagatagat 
tttattgctg 
ctgctgatac 
gaagagcaaa 
taaagggcca 
tgttataaga 
tttattcctc 
gttaaagtag 
cactgtaaac 
atgaggaaat 
caaaaagtgt 
ctaccctccc 
cagtatcttt 
tcatcccagg 
tttaggccaa 
tccaggagga 
caaatgcaga 
tgaagtgtct 
gaccagtgaa 
ttactatgaa 
aagaatgcgt 
tatatatttt 
aagatcataa 
tttacattta 
ataggaagtt 
atacctttgc 
atctctttat 
tgctatactt 
tactttattt 
taaatgttta 
agaagtgcac 
ggtactgacc 

aacatcatcc 
aattaacaca 
tttaggtctt 
acacttttta 



37440 
37500 
37560 
37620 
37680 
37740 
37800 
37860 
37920 
37980 
38040 
38100 
38160 
38220 
38280 
38340 
38400 
38460 
38520 
38580 
38640 
38700 
38760 
38820 
38880 
38940 
39000 
39060 
39120 
39180 
39240 
39300 
39360 
39420 
39480 
39540 
39600 
39660 
39720 
39780 
39840 
39900 
39960 
40020 
40080 
40140 
40200 
40260 
40320 
40380 
40440 
40500 
40560 
40620 
40680 
40740 
40800 
40860 
40920 
40980 
41040 
41100 
41160 
tttttgtttc 
gtattgcttt 
ttttttagac 
ctatgcaatg 
ggggctacag 
ctccctacct 
agccatcgca 
atcatcttcc 
taaagcatca 
tatttagtag 
tagccaattg 
aactgtctga 
atacatcaat 
tcagtaatat 
gcatgtgtaa 
accaaatcta 
ccatattaaa 
ttgtctatcc 
taactcttat 
tgaatataca 
cattctgttt 
tgagagacac 
taccatttta 
tattatactt 
catgtgccat 
taatgctatc 
ttcctatgtc 
ttggttttct 
catccatgtc 
ggtgtatatg 
tcaagtcttt 
gcagcatgat 
aataccattt 
taaggcttta 
gggatgtaac 
tcaaagaata 
agagcccatt 
ctaatgaaca 
aatattcttc 
tttataaacg 
tagggcagac 
gaccaatact 
tacaaatcct 
gtaaaggacc 
tggttttatt 
ttgaaagtga 
actgtacata 
atcaaataaa 
ctttaaaaat 
aactatggtg 
aactttttct 
atttcactgg 
gcaagctcct 
ttgagcaata 
gagttggcag 
agtagtttag 
taattgttta 
ctaaaagaat 
tgtatctttc 
ctggaaaatt 
tatgcagtca 
ttggttttgt 
tgacaatcat 



tgagtagctt 
tatgtaactg 
agagtcccgc 
tctgcctccc 
gcacgttgcc 
caaataatct 
cctggcctgt 
aaaatataca 
tgtcatctgt 
gtaggacttt 
agtaatcaca 
acatggtgtc 
tggtccttcc 
agatagataa 
aactcatctt 
taagtaaatt 
atttttgaga 
attcagttat 
gttttattga 
tatacataca 
aacaaagagc 
cagtattata 
tttttattta 
taagttttag 
gttggtgtgc 
cctcccccct 
catgtgttct 
gtccttgcga 
cctacaaagg 
tgccacattt 
gctattgtga 
ttataatcct 
tacttttaaa 
aactcaattt 
aaatctacat 
accaccattt 
ttgattcatt 
catgtctagc 
cttctatttt 
gtcctggtct 
tagttccagc 
tctgcttcct 
tcctttcctc 
tcaccccagt 
atattaaaat 
tggctagttt 
aatgttgtat 
tgttaagccc 
actctaggga 
agttaacatt 
gtgtgtggag 
acttttctat 
ctgatgtata 
atttttacag 
caaagactgc 
tggcatataa 
tttactgttc 
ctcacattca 
tctgtgggtt 
tgttcactcc 
gtgtacatga 
atattcttag 
aggtttctgt 



atgattttat 
tatgagtgtt 
tctgtcaccc 
aggttcaacg 
atcacaccgg 
gcccacctca 
atgagtggtt 
cttaatttcc 
gaataacaca 
cagtaccata 
aacacacaag 
atttgtatag 
agcagaagca 
tataaaactt 
tgagttagta 
aatgtcacct 
ctttgctctt 
tattgagcca 
aatgttattg 
cacatactat 
caaattgtca 
gtcctatatg 
tttatttatt 
ggtacatgtg 
tgcactcatt 
ccccccactc 
cattgttcaa 
tagtttgctg 
acaagaactc 
tcttaatcca 
atagtgcccc 
ttgggtatat 
ttagtgcaag 
gaggaaaaaa 
tttgattttg 
gtgatctaag 
tgtgctaacc 
ttcccttatc 
cagccctatt 
cagacctgtc 
ataaatccag 
tctactgatt 
aagtctttca 
gtctatagac 
tacaaaactt 
ctctgtttgt 
agggatttca 
ttgaggaaca 
atatagctaa 
aacaggtagg 
attagtatcc 
tcatatgtta 
tgtgagagct 
tcatagctta 
ttttaacaat 
atggagacaa 
aaatgttgca 
ttgttcttca 
taaccattat 
aataacccca 
ataaacaggc 
ccatatatca 
ttgtttgttt 
gtgatatttt 
tttgggtttt 
agattggagt 
gattctcctg 
tttcactatg 
ggctcccaaa 
tttgtgtatt 
catttgtata 
gatttgtttc 
ttgaaccgac 
taaaatctta 
ctaaatccat 
aagtgacatc 
tctgaaggct 
taatatacat 
tctaatgtgt 
cctcatcaaa 
gaatcatttg 
catattaaca 
agggcatatt 
attagtttta 
acccttcatt 
tatttatttt 
cacaatgtgc 
aactcgtcat 
cacaacagtc 
ttccaaccta 
agaatttgct 
ttcatttttt 
gtctatcgtt 
aataaacaca 
actcagtaat 
gaatcaaaac 
atagctatgc 
atatatagct 
agaattaggg 
tggaactatc 
atctaacaaa 
cctaagtcaa 
tttgattctt 
aacgtccgtc 
atgttgtcac 
gctgccacac 
aaaactggag 
atctatctat 
gctaaagctc 
tcttcaaagg 
aatgatatta 
gaaatatata 
tcatttgcta 
aatcaataga 
acatttcatt 
aaataagcaa 
ttttcatgaa 
gttctgtgat 
aaaaacttca 
ttctcagaaa 
cagacccagc 
ttctcatatt 
ggaattgtaa 
aggtaaaatt 
agcactaagt 
gtttgtttgt 



taagttttat 
tttattttgt 
gaagtggtat 
cctctgcctc 
ttggctaggc 
gtgctgggat 
gaatttgtaa 
ttttaaaata 
ttttcaattt 
tacaactttt 
agggccaata 
taggtgttcc 
gactaaataa 
ttctatgaaa 
agctaaatct 
ggacatctga 
aataaccaat 
tagtcagaag 
tatgtaagaa 
gccttcaaat 
tgcattaagg 
ctatagaaat 
ttattttatt 
aggtttgtta 
ttagcattag 
cctggagcgt 
tgagtgagaa 
gaaaattgag 
atggctgcat 
gttggacatt 
cgtgtgcatg 
gggatggctg 
catacagata 
tcaaatctct 
atattgtgtt 
atgcctcttc 
agtgcactgg 
tgcagtgtaa 
caattttatg 
agcttgttaa 
ctccaaagga 
tctgctgctc 
ctttctaccc 
agttctgatg 
attttccttc 
tgatgaacac 
agaatctagc 
taattggttc 
tcttgccacc 
catttgttca 
ggaaacccat 
gactctaaaa 
tttctttagg 
gaactcagag 
tctctaacaa 
ctctcttcct 
aggctacttt 
ttttacttga 
ataatgtgag 
tacacacaaa 
tcaaagtgaa 
ttaactaatt 
ttgagacgga 



attctactgt 
tttgttttgt 
aatcttggct 
ccaagtagct 
tagtctcaaa 
tacagttgtg 
ttgtaatcat 
acttttataa 
gtatactttt 
aaataaaaaa 
ggatgtatcc 
actagcagac 
actaatgtgt 
cataaaataa 
gtattcccta 
tcatgtgatc 
ccattacttt 
agtcattttg 
aacaaataaa 
tgaaagctgc 
gatgaggaaa 
atttaagaaa 
ttattattat 
catatgtata 
gtatatctcc 
gatgttcccc 
catgtggtgt 
tttccagttt 
agtattccat 
taggttggtt 
tgtctttata 
ggtcaaaaga 
aaaactgaga 
caaatttcat 
gactgatttc 
tgtcttggtc 
aaaatacaga 
acaagtttta 
actcatcaat 
tgactttcaa 
cctgggccct 
tccacatcct 
ctcaccctca 
aaaaccttct 
ttccatatta 
atgtaacaaa 
tgttcaaact 
tttagaatta 
ttgctttaca 
ctaaaaatac 
attttcatca 
tcaattatat 
gtgcaaattt 
agagtggctt 
tattttaata 
tcaggtgcta 
gaatagtcat 
aattataata 
ctccatgaaa 
aaaatcaagc 
aaaacagttc 
taatgttcac 
gtctcgctct 



41220 
41280 
41340 
41400 
41460 
41520 
41580 
41640 
41700 
41760 
41820 
41880 
41940 
42000 
42060 
42120 
42180 
42240 
42300 
42360 
42420 
42480 
42540 
42600 
42660 
42720 
42780 
42840 
42900 
42960 
43020 
43080 
43140 
43200 
43260 
43320 
43380 
43440 
43500 
43560 
43620 
43680 
43740 
43800 
43860 
43920 
43980 
44040 
44100 
44160 
44220 
44280 
44340 
44400 
44460 
44520 
44580 
44640 
44700 
44760 
44820 
44880 
44940 
gtcgcccaga 
ttcacgccat 
gcctggctac 
tcttgatctc 
gcgtaagcca 
tttttcttct 
tggacttcct 
ttgttttttt 
aagacaagta 
atagagaatg 
catataaaga 
gcctagatga 
tcagagagaa 
ccccaaacca 
tagagaagat 
tgtgcatgat 
ttggtcacct 
acaatttcct 
tcttcatcgc 
atgtagccat 
ctctggtcac 
cctttcactt 
ctcttatcat 
caggctttac 
cagcgatctt 
cccacctgac 
catcagagaa 
caatgctgaa 
aaatgattag 
ccctaagtgc 
cagtatgggc 
gtttttaaaa 
atgaatataa 
gggggtaaac 
tattactatc 
aaaaaatgat 
ataggtggat 
gtctctccta 
ctcaggaggc 
agatcacgcc 
aataaataaa 
tgtggtgtct 
atgtaaatat 
tgtaaacatc 
ttctgaaaat 
atttctatca 
gacttaggaa 
aaatgacttt 
tctacagagg 
aatgctactc 
agaaagcaag 
cagactaaag 
tgtgttgatt 
tgataaaagc 
gcgtgttctt 
gtcatgaccg 
gtcaccctgg 
taggggcgta 
tggcacatta 
aatcacaaag 
gactgcatat 
taatattttg 
tatgatatca 



ctggagtgcc 
tctcctgcct 
ttttttgtat 
ctgacctcgt 
ccgcgcccgg 
gttttgtatc 
acatttattt 
tgttccctca 
taaatgcatt 
tgcctgtgtg 
gaaaacgttg 
atagatgaat 
tcctcacatt 
caccatagtg 
cctgtttggg 
cctgctgatc 
ctccttttta 
ctcagaacag 
cctagtgatc 
ttgcagccct 
tgtgccttac 
atccttctgt 
gctggcctgc 
tctctcaagc 
caggatccgt 
aatagtcact 
gtctgtagag 
cccattgatc 
gggaaaatcc 
ctgtggggta 
tcttagtaac 
ataaaaagct 
attgtttggt 
ttttaactaa 
catagagcca 
gccgggcatg 
cacctgaggt 
aaattacaaa 
tgaggctgga 
attgtactta 
taaataaata 
tgaacaatag 
atgaatcacc 
gtgagaagct 
caactctcaa 
tcaaaatact 
taatttagta 
gttagtggaa 
aaaaatttca 
caaagagatc 
tggaggaatg 
taagttgtgg 
taaagaaaac 
atatactctt 
gtaggtattt 
gccaggactg 
atctcctaga 
cgtatctttt 
tttttttaca 
aggctaaata 
taatacatac 
cactctgctt 
actactgtat 



ctggtgtgat 
cagcctcccg 
tttgagtgga 
gatctgcccg 
ccgacagtca 
tgaaataata 
atttaaatga 
caaaccttga 
aatgtataaa 
tgtgtgggtt 
tgcagaaatt 
gagtgaatga 
ctttgtcact 
acagaattca 
gtgttcctgg 
aggaccaatt 
gacatttgct 
aagaccatct 
actgagtttt 
ttacattaca 
atgtatggct 
ggctcccttg 
tctgacaccc 
tctctcttca 
tctgctgaag 
ttgttttatg 
gagtccaaaa 
tatagcctac 
ttttgtaaaa 
acaaactgaa 
cactttagtt 
taatgttgaa 
ttctaataat 
cttagcctga 
tagtgaggct 
gtggctcaca 
caggagtttg 
attagctggt 
gaatagcttg 
ctccacttgg 
aaaagaaaac 
ttctgaaacc 
tataagttaa 
tcttattggt 
cacccttttt 
tcaattgcat 
ggaatcaata 
aaaggggtct 
agataagctg 
atttattaaa 
caccctcttt 
ctacatgtgg 
gatccttgat 
atgagaattg 
ttaggctgtt 
tgccttgtta 
ctcctgcttc 
tggaggaaat 
ttagggtatt 
aagatacaca 
atatttagcc 
ttaactaaca 
ctactataaa 
ctcggctcac 
gtagcttgga 
gatggggttt 
cctcggcctc 
taggttttaa 
agatatcaga 
tgctaatatt 
aaataattac 
tttataaatg 
tgagatacaa 
ctgatgtgtt 
taaatggatg 
ttcagttttc 
ttctcttagg 
cgatctacct 
cccaactgca 
attcttccaa 
cctacgctgg 
acttccttgc 
gttccaggat 
tccttaatgg 
aaatcaatca 
gtgtcaaaaa 
tcattcttct 
gcaggcacaa 
gaaccctctt 
taattgcagt 
ggaacagaga 
ttgcagttta 
atggaaaaac 
tcttctcaaa 
attaataatg 
tctgtatgaa 
gagaaactct 
ccagataggg 
tctgtaatcc 
agaccagctt 
tgtggtggca 
aacctgggag 
gcgacaagag 
gaaaaaagaa 
tgtgaactta 
attttctgta 
tcatgtggta 
ttttttttga 
tccacgtatg 
gggtttatca 
tgatccagat 
caaagtgcag 
ggctattgca 
aagttttttt 
gtaggctgac 
attttagtgt 
ggacacctag 
tcctaaacta 
acctcaagac 
tttaacaccc 
agtgtttatt 
ctagttctca 
ttaatcaatg 
ctattaaatt 
ctatacagaa 
atgttcaagt 



tgcaagctcc 
ctacaggcac 
cacagtgtta 
ccaaagtgct 
gtgaaaatgc 
catggaggga 
aatatctgat 
caaatgtaga 
tatatatata 
aaagagacag 
tattgaaaaa 
aaacaaatgc 
aagaaataag 
actgacagac 
aatcacactg 
aacacccatg 
tgttactcca 
atgcttcaca 
ttcaatggca 
gtccaagaac 
gctctctcag 
tttctactgc 
gatggcaatg 
gtcctatctt 
agccttttct 
ctgcatgtac 
cttttatact 
tgtaatcctt 
ggcctgtgtt 
ctagtgtagt 
attaacactt 
ttttatttgt 
aatactgggc 
gaaaatccaa 
tggctaattt 
cagcactttg 
ggccaacatg 
catgcctgta 
gcggaggttg 
cgaaactcca 
aaaaattgtt 
ttgtaatcac 
agtattctca 
ataccagtta 
gttttcaagt 
ttttccagtg 
attgatattt 
cccagcaggg 
tgagaagaga 
ttacagagta 
aatggtcttt 
ggcatgacaa 
gtgcataact 
gttctcttgc 
taagcatctt 
agagttgatt 
tgaaacagta 
ttgcttcaca 
aagctactac 
tattatttag 
gcctcttatg 
aatgctacag 
ggttttccag 



gcctcccggg 
ccgccactac 
gccaggatgg 
aggattacag 
taccttaatt 
aactgaagat 
attcatttct 
attcaccaag 
agagacagaa 
aaaacacaca 
agtaattggt 
caaatctgga 
aagatgttgt 
gacccagtgc 
gcaggcaacc 
tatttcttcc 
aatatgctgc 
cagtgtcttc 
ttggatcgct 
atttgcatct 
acactgctga 
gctgatcctc 
tttgtagttg 
ttcatttttg 
acgtgtgctt 
gtaaggcctc 
tttttgagcc 
gccatacaac 
tatttgtaat 
tattatttaa 
tgaagattta 
cagagattct 
ttacttactt 
cccctttgaa 
tacattttta 
ggagccagag 
gtgaaacccc 
atcccagcta 
cagtgagtca 
tctccaaaaa 
tctacttcaa 
tgtgcaaaaa 
aaacccacac 
tggtcctcaa 
gcatggatcc 
aagtacctct 
gcatatgcat 
ggttcttgga 
tagttcatta 
aggtgttccc 
tatctacgta 
aatttatcat 
attattatca 
tgcattatta 
atgaacatgg 
ttaaaatgtt 
taaaaattaa 
caactgtctg 
agggaaaatc 
tcaaccagat 
tttgacattt 
aaaatgctta 
aatattaaac 



45000 
45060 
45120 
45180 
45240 
45300 
45360 
45420 
45480 
45540 
45600 
45660 
45720 
45780 
45840 
45900 
45960 
46020 
46080 
46140 
46200 
46260 
46320 
46380 
46440 
46500 
46560 
46620 
46680 
46740 
46800 
46860 
46920 
46980 
47040 
47100 
47160 
47220 
47280 
47340 
47400 
47460 
47520 
47580 
47640 
47700 
47760 
47820 
47880 
47940 
48000 
48060 
48120 
48180 
48240 
48300 
48360 
48420 
48480 
48540 
48600 
48660 
48720 
agatgactaa 
gtctggttct 
tgggcagcaa 
caatggcagt 
tgggaactga 
tgggatattt 
ttcttttatc 
tggtctgata 
tgcctaaata 
gcaaccaaac 
ttttttttaa 
cccttgggta 
gaatagaaat 
ttttctttac 
tggagtctta 
cacaaaagca 
attaagccat 
taagagtgtt 
acagctaatg 
ttaaaattgt 
gcttcttagg 
ttaattaatt 
gacaaggaat 
acatgttact 
cagattttca 
atttttaatg 
aatgtagaaa 
taatcaacta 
atcttgcaaa 
ctgtacgtct 
aaatgttttg 
gaaagcttgt 
atcagcagta 
tagaaacaaa 
aaaggctaca 
gtatttttat 
cacaattgaa 
gaatatcatg 
ttgtgccctc 
gatttgtgat 
tttttttgtc 
tattttaatt 
agatactttt 
aaactgtatg 
taaagcaatt 
gcttaaggat 
gtgtctggta 
acactgttat 
acattttttt 
tcccagcaat 
cttcaaacac 
atcagagaat 
agcaaattgc 
gcaatgtatt 
aaaaataatt 
ttgattaaaa 
tgtaagatat 
acattgattg 
gtacaaaagt 
caacctaata 
ttcaatttta 
tttttatttt 
cttggcatgg 



cagtttttgt 
atccaaatct 
caatttaata 
ggcacaacag 
gtgacacagc 
ctttatctta 
ttcctctttt 
ctctatttga 
actgtatact 
ttaaactttg 
aatatgtatt 
aaagttatct 
agtgacaatt 
ttaaatgaaa 
gacacagaga 
gatttcttgg 
tcttcaacac 
actatagatt 
tgaattttcc 
cttaatatcc 
ctgcatttta 
tccaggtaca 
aagttttgtt 
ataatgcaca 
tttgatacta 
tcatatttta 
accaatgaac 
cctgaaaaga 
tcatgattcc 
ctctctaatt 
aacattttta 
aacaaacctg 
ggagctcttg 
ttctgtttta 
tataataatt 
ggagaaaaat 
taaacaagca 
aattatcctg 
acatttttaa 
gtcaccttta 
tgaattataa 
tctggtgaag 
agctatccaa 
tatagttgta 
aaaaaataag 
gaaatatttt 
agtgaggaga 
acaaatactt 
tcaacaaata 
tacccccaaa 
tgaacaaaaa 
aatatatctg 
tttaatttga 
tgccacaaac 
tgaagcacta 
tacaaaaatg 
tgtttgatat 
attaagggta 
aatcttgttt 
ttattatcat 
agaatctact 
ataatcaaga 
taccatgtaa 



aaaattgggt 
gcttttagaa 
tttacctttt 
aagcaggtga 
gtccttactt 
tgacaacatc 
ggttaggcaa 
aaactgtatt 
tccttttatt 
acacctttgt 
tatttattct 
cagaaattct 
gatgtaatca 
atgaagtagt 
gatgaaatga 
ttttaaattt 
acttcaaatg 
aaatagtggt 
ctgacagtca 
atgtgaaggg 
ttaatatatt 
gacaaatata 
tatattttgt 
aaacaaatat 
taatgtatgc 
attagcttgt 
catattggca 
gttgctctca 
atcatatcaa 
gttccttttg 
aaggtcttgc 
taccactaat 
tttatggtta 
atttgcacag 
ttctttcatg 
tatattttgt 
cttatatggg 
gatataattt 
ccaacttcct 
ttatacctta 
atatatttta 
atatttttgt 
atttttaact 
taattttgaa 
gaaaattatt 
gaatgagttg 
aaacacctaa 
atgttttcac 
tctctaaata 
agaactgaga 
aggattaatc 
tgcttcataa 
agataaaact 
aagataaatt 
aagaatcaat 
ctaaaattaa 
atctacattt 
taaagttttt 
ttttgccatt 
attaaggaga 
gaaggcataa 
caaagtaaac 
cagaaacctt 
tatattctta 
acagtaactg 
agcactattt 
tctgtagaga 
ccttttagag 
atttaatctg 
agcattgtgt 
ttagaattgg 
ttctagttat 
aacccaaatg 
atttgtaatt 
ttggagatag 
tagtgtattg 
gagagaatta 
cttgcctaac 
tctctctttc 
ctggggggaa 
tattatgtga 
ggctaatgtg 
agaaaatttt 
ttcccaatta 
aaaacataaa 
tatatttctt 
aaagttactc 
acctttacat 
gtgatcctaa 
tacataagaa 
gcattcacat 
tgtatcttag 
ttatgttccc 
taaacatttc 
gtttgaaaac 
tatgctactt 
attagaaatt 
ttatgatggt 
ttattgttgt 
gttattggct 
tgttgcatgt 
ccaagtgtta 
agtcaatcta 
aactaaattc 
cttcaatttt 
cagaaattat 
tgtagttgta 
ttactgtcta 
tttgttttcc 
ccttggtata 
cataccagtg 
cagtctcttt 
gggccacata 
agaaagaaaa 
cagtgggaaa 
atttaaaatt 
ggtaaaatta 
atataaatat 
cacacaaaat 
aaaagagctg 
ctgaaaataa 
aaaagtaaca 
caaaattttc 
gaaacttgtg 
acatgatgta 
tccatactgg 



agtgtcagtt 
aaaaagaaag 
gtgtatttgc 
caagagtgaa 
gcccagcatc 
tgggatagct 
tcaatttcta 
tacaaacaca 
gagaattgct 
cttgacctgt 
taatttttta 
aatagaacat 
ctaattttag 
ttatcttcat 
gtcatttggc 
taattcatac 
tattggagac 
tagtgagatg 
atcatttata 
gatatatggg 
taaaagcaca 
atttattaag 
ctcaataaat 
accatataat 
agattaaata 
tacaatatta 
atagaagcaa 
gtgttttatt 
tattttgatg 
agaagtgtaa 
tatggataaa 
aatcatttta 
tgatctctca 
gaagaatata 
gtcttacatg 
tttgtaactt 
ttcattgcaa 
tttaatgttg 
ttaagcactg 
tttctgccca 
ttatcattaa 
ctcttaggag 
aatattatga 
catacaatac 
tggtggccat 
agaaagtcag 
ttaattctat 
tatggtgaaa 
ttcttacaaa 
tgtttttcca 
taaatattta 
aatttgtagt 
aagaggcacc 
ttgaaataca 
caaagactat 
attagcaacc 
atatagattg 
atttaatatt 
gcacttattt 
ttatgctttt 
tgaagagttt 
ttaaatacaa 
aaacgtgtct 



ggttgaataa 
aaggcaattt 
ctggttttgg 
gcagaagagc 
atttcatctg 
ctttatatta 
gcaagtagag 
ctctatattt 
gtggcttgca 
ttgtaatctc 
attgtgcaag 
aaatggacat 
gattactgtt 
ttacaatatt 
cactcagtag 
tgaagtggtt 
tatgaattaa 
tatcctaaca 
tgcatgataa 
aaacctttat 
catatttata 
gacctgctat 
gtatatgact 
aatatgaata 
ttttatgaac 
atacatgtca 
tgcttacttg 
ccatttgatt 
accgacgtat 
ttgctgaatt 
tttttatcca 
ctataactct 
aaggaagttt 
tttaacaaat 
aagaatattt 
tttcttatgc 
tattttatgg 
taaattactt 
ctttcccact 
tttccactgt 
attgaacata 
attgaaaatc 
ttcaatttag 
ttatacatta 
agggagaact 
gaagatgctg 
ggatttatta 
tcatcatgta 
gaggtttttt 
ctatgagtca 
aatttaaata 
aaaaagagta 
ctacaatttg 
agaaattaat 
aagtaaaaaa 
aaaaatatga 
gcactttcat 
taataggttg 
tttttcttac 
attcagctat 
catggtgttg 
atagttcctg 
tagtttttgt 



48780 
48840 
48900 
48960 
49020 
49080 
49140 
49200 
49260 
49320 
49380 
49440 
49500 
49560 
49620 
49680 
49740 
49800 
49860 
49920 
49980 
50040 
50100 
50160 
50220 
50280 
50340 
50400 
50460 
50520 
50580 
50640 
50700 
50760 
50820 
50880 
50940 
51000 
51060 
51120 
51180 
51240 
51300 
51360 
51420 
51480 
51540 
51600 
51660 
51720 
51780 
51840 
51900 
51960 
52020 
52080 
52140 
52200 
52260 
52320 
52380 
52440 
52500 
ggttcttagt 
tacaggcatc 
tgatcatgaa 
gccattaaag 
atgtgcaaag 
aggggaaata 
gagaaaataa 
attacaggat 
aaaaagcaag 
atgaacaact 
gggcacactc 
gtttcacttg 
tgtggtttcc 
agagaacttc 
aaagcttggc 
gcacccgatc 
cagccactag 
tgttgaacag 
agcctagagc 
tccaagccag 
gtctggtcaa 
tcagcttgca 
ctggtcagtg 
acagccagca 
ccagtcagca 
caggcagcag 
ccacctctta 
atttgtttgt 
gtgtatttgt 
tttttgagca 
taatgtacga 
ttttaatgac 
tttcatttat 
cttctttttg 
tactcaacaa 
aacccacagc 
gaacaagaca 
ccagagcaat 
tatctctctt 
agctcctaga 
aatattagca 
cccatttaca 
ggtgaaagat 
aacaaatgga 
tattgcctaa 
catagaattt 
caaagcaatc 
atactacaag 
caaaatgaaa 
aagttgatag 
actatttagc 
ttaactcaag 
aaaactgagg 
cagaagcaat 
actgcataga 
aatgataaat 
aggcaagaaa 
acactggggc 
aatgtagata 
caaacctgca 
agacataaac 
tacagtatca 
accggtcaga 



tcctacaaaa 
caccagatgg 
ggaaagtgta 
tgaggatgtt 
atatatccta 
aaacttgtta 
ggcaacaaaa 
acttatttat 
gtaagaaaaa 
tctcagagaa 
ttggagcata 
cttctttccc 
ccagtgaaga 
ccaagaggcc 
caccccagtc 
ttggtggtgg 
cattgctcac 
ttctgtctgg 
accacccaac 
caaccccact 
ttatctctcc 
gcgccaaaaa 
atctcaccta 
gccccactcg 
gccctgagtg 
tcagaggtca 
ctccctatga 
ttgtttgttt 
catctatctc 
acagctatat 
gcattccttt 
aatcattaca 
gattagtgaa 
aagcttttga 
actaggcatt 
caacatcata 
aagatgtcct 
aaagccagag 
cactgataat 
tctgataaac 
tctctatact 
atagccacac 
ctctacaagg 
aaatcatccc 
agcaatctac 
gagaaaactt 
ctaagcaaaa 
gctacagtaa 
cagtgaaccc 
taataaacaa 
catgtgcaga 
atggattaaa 
aaataccact 
tgcaataaaa 
aaataaatta 
ccaacaaaaa 
cctcccccta 
cttttgtggg 
acggattgat 
cgttctgcac 
agacacttct 
ttaatcatca 
atggctatta 



tcacaaagtg 
gatagaagct 
aagtacaggt 
cctagttaag 
tgctgtatac 
ttaatagcta 
actgtgtatt 
taatattgga 
gaagaggtcc 
aaatgcccat 
aaaactggaa 
agccaaacac 
aaagagagcc 
cattctgtct 
agctcataac 
catcatgttc 
atgtggagct 
ctgtggtacc 
cacaaagcac 
caaccgctgg 
aaactacaga 
actgcaggaa 
aatttggagc 
aatacatacc 
attttggatc 
tgtgactaga 
aatctcaggc 
tcacagtact 
ctccagcaaa 
tgagttatat 
ttctgtgtat 
aattggttga 
ttaaaccttt 
taaaatccaa 
gaagagacat 
ctaaatggga 
ctctcatcat 
aaagaaatat 
atgattttat 
aactttatta 
cccataacat 
gcacaaaaat 
agaactacaa 
atgtttatga 
agattcaaca 
gcctaaaatt 
agaacaatac 
ccaaaaaagc 
agaaataaag 
tagggaaaga 
agagtgaaac 
gacttaaata 
ccggacattg 
ataaaaattg 
acagagtaag 
tctaatatcc 
aaaccattaa 
gtgggaggct 
gggtgcagca 
atgtacccca 
caaaatgaag 
gagaaatgta 
ttaaaaagtg 
agaactgcct 
gtctctaact 
gaaacgtttt 
aaaaaggtaa 
agtgtgatat 
tgtccatatt 
ttcagtttag 
atttaaatat 
ccaattctgg 
gtgaaaattc 
aaaaccacat 
agtgtcacac 
cacggcagac 
aaccttgtgg 
aaagttaagg 
tagctggcag 
gagttggcag 
agtcagtaag 
aacttgcagc 
gcatagcctg 
gcacatccag 
atagccagtg 
aaaggcagtg 
acagtgaatg 
ctagccagca 
gtcagctttc 
aaattattta 
tatcatcacc 
aaacattatc 
agtggctata 
ccttgcaaac 
ggttatactt 
ttttcatata 
catctcttca 
acctcaaaat 
aaaagctgaa 
tcctatttca 
aaggcatcca 
acctagaaaa 
aagtttcagg 
tcaagctgag 
aaaacaccta 
aacactactg 
attagaagaa 
ctattgccat 
tatatggagc 
tggcagcatc 
atggtactgg 
tcagacatct 
actacctatt 
tggacccata 
taaggcctaa 
gcctcagcaa 
aaaaatggga 
caatctaaag 
agaatatata 
aacacatgga 
aggtaaggga 
agccaccatg 
gatcttaaag 
catacatgtg 
tatcacaact 
aaataaataa 



gcatttaata 
aagtagaatg 
ttgtaataga 
ttcattttat 
cttgtaaaat 
taaatagtca 
tccagatctc 
gaaagtaagg 
tttccagcac 
cgtaacccat 
tagaattgtt 
tgaaagagat 
atccagcttc 
gtaataacgg 
ggtgtagctt 
ctttatgtga 
gccctttcgg 
ctggaccagc 
ccaccccact 
cagtcccatc 
taaaacttcc 
gccctatctg 
accccatcaa 
gtcttacctg 
tactcaccca 
aaattcatca 
acctctactc 
tgccatatat 
aataaagtag 
ctaatttaca 
ctctgttcat 
cattatggtc 
cctgctgaca 
taataataat 
aataagagcc 
agcattccac 
catagtactg 
aataggaaaa 
ccctaaagac 
atacaaaatc 
agctaaatca 
ggaaaacaac 
gaagaaatca 
tcagaatgct 
caagctaatg 
caaaaaagag 
aaataaccca 
tacaaagaaa 
acaactgtct 
caataaatgg 
cctgtcacca 
agctataaaa 
aaaatttatg 
tgtaattaaa 
aatgaaagaa 
agaaacataa 
cacagggagg 
taccattagg 
gcatgtgtat 
tataataaaa 
gccaaaaaat 
acaatgagat 
cagacattgg 



actggtaaaa 
tcattagcct 
aagttataca 
gtaatatatt 
attagggata 
ttgctattgt 
aggatgtata 
aaaagcaagg 
agaatgacca 
gggtgaggtg 
aagaggagca 
tccctgggcc 
cctatgttcc 
aagacttggc 
acggcaacca 
ccagagagcc 
tcaggaggca 
tgcagagcac 
caggtagaga 
tgacgacgga 
caattatgga 
gccaagagat 
tctgtggacc 
attatggagc 
cctcagagca 
agtcctggtt 
ctatgcccca 
ttgtttgtgg 
tttttaattt 
tttttgccaa 
ttttttgtct 
ttgatttgca 
atttgaatat 
aataaaaaaa 
atctatggca 
ctaataaatg 
caagttctag 
gaagtcaagc 
tacaccaaaa 
aatgtataaa 
agaatgtaat 
aaaccaagga 
tagatgatat 
aaaacagcta 
acactattta 
actgaatagt 
acttcgaact 
gacccataga 
gatctttaat 
agccaggata 
catataaaaa 
atcctggaag 
aataagtctc 
ctaaagagct 
aatgttcaaa 
acaattcaac 
ggaacatcac 
agaaatacct 
acctgtgtaa 
aataggcaaa 
atattaaaaa 
aacatctcac 
tgaggttgtg 



52560 
52620 
52680 
52740 
52800 
52860 
52920 
52980 
53040 
53100 
53160 
53220 
53280 
53340 
53400 
53460 
53520 
53580 
53640 
53700 
53760 
53820 
53880 
53940 
54000 
54060 
54120 
54180 
54240 
54300 
54360 
54420 
54480 
54540 
54600 
54660 
54720 
54780 
54840 
54900 
54960 
55020 
55080 
55140 
55200 
55260 
55320 
55380 
55440 
55500 
55560 
55620 
55680 
55740 
55800 
55860 
55920 
55980 
56040 
56100 
56160 
56220 
56280 
gagaaaatgg 
agcagtttgg 
attactgggt 
caatcattgc 
tgatgaacgg 
aaatgaaatc 
attaatgtag 
attgaataca 
gagctgggga 
aataagacca 
ggaaaaggac 
attgaagctg 
acttaaatgc 
tagaaaaaga 
aaaagcaaaa 
aactgttaac 
atctgacaaa 
aacaacacat 
gcactttggg 
ccaacatggc 
gtgcctgtaa 
tagaggttgc 
ttctgtctca 
gaacataaac 
caatatcact 
cagtcagaat 
aaaaaggaac 
cagtatggca 
tactaggtat 
tgttcatttt 
tgacagtttg 
agaagaagaa 
acaaactaat 
aaatgattag 
ggggagggtg 
tctggtgatg 
accttcacat 
ggtactatgc 
aatttaccta 
taaaatacaa 
caatcagatt 
cctgaatatt 
ttgtcttttc 
aatcctgttt 
ctttacctag 
tttcaggtca 
atagtctttt 
ctgatttgga 
tgtatccttt 
tttcattcag 
ttgttgtttt 
tttttgaggg 
cagagatttc 
ttctggttta 
ttttgtggtt 
gggagtttga 
gatcaacctt 
gtggagtact 
aatttattag 
actgctattg 
ttgggtgctc 
gaaccctatg 
ttcttttatc 



aatccttata 
aaatttctca 
ataaacccaa 
agcactatac 
gataaagaaa 
atgtcttttg 
gaacagaaaa 
catgaaaaat 
gtgcagactg 
cacatctatg 
tccctattta 
gacccccttc 
aaaaccccaa 
aatgggcaat 
attgacaagt 
agagtaaaaa 
ggtctaatat 
ttaaaagtgg 
aggcagaggc 
aaaaccctgt 
tcccagctac 
agtgagccga 
taaataaata 
acttctcaaa 
gattattaga 
gactattatt 
atttttatac 
attcctcaaa 
atacccacag 
ggtactattc 
gataaagaaa 
gatcgtgtct 
gctgaagcag 
aacttacaaa 
ggaggaggga 
aaataatctg 
gtatccccaa 
tcactacctg 
tgtaacaaac 
caaaaaagaa 
tttaaaaatt 
agtcccttgt 
actctataga 
atctatattt 
accaatgtct 
tacatttgtc 
atatgtctga 
tcttctttct 
caaataaact 
ttttactctg 
atagttcctc 
aagtgtttag 
ggtaagttgt 
gttgtttgcc 
ttgagagatc 
catgatttta 
ccagtatgtt 
ccgcagatgt 
ttttcagacc 
tgtggctttt 
caatgtgggg 
tcattaggca 
tgacataaca 



cacttttggt 
aagaacttaa 
aggaaagtaa 
actgtagcaa 
atatgataca 
cagcaacata 
ccaaatgtgc 
aaaggagaac 
aaaaatgatc 
accatctgat 
ataaatggtg 
cctataccac 
actgtgaaaa 
gatttcatga 
aggatctaat 
gacagcctat 
tcagcatgtt 
gctaagatgc 
aggtggatta 
ttctactaaa 
tctagaggct 
gattgcacca 
aataaataaa 
agaagacata 
gaaatgcaaa 
aaaaagtaaa 
tgttagtggg 
gagctaaaag 
aattataaat 
acaatagcaa 
atgtggtgca 
tttgcaggaa 
aaaaccaagt 
cacaaagaag 
taggagcaga 
tgcaacaaac 
aactaaaata 
ggtgatggga 
ttgcacatat 
aaaagatgaa 
tatttatttt 
cagatgaatg 
ttacttcttt 
ggttttgttg 
aaaagtattt 
tttaaaacat 
ggaattgatt 
tattttcttt 
tttggtttta 
atttagttat 
taggtgagat 
cgctgtacac 
gtctctgttt 
caaaagtcat 
ttggtattga 
tttcttaatt 
ccaggtgcag 
ctattaggtc 
ttgatgacct 
taggtctttt 
tgtgtatatg 
gtgccctttt 
atagcaaccc 
gggaatgtaa 
aacacaagta 
ataattttac 
agacatggaa 
tacacactat 
gatgcagctg 
ccatgttctt 
aagcaacact 
agaccaatgg 
cttcaacaaa 
ctgggataac 
acacaacaat 
ccctaggaga 
caaagacaca 
taaacttaag 
ggaatagaag 
taagaaactt 
tggcaacagt 
tctgaggtca 
aatacaaaaa 
gaggctggag 
ctgcactcca 
taaataaaat 
catgcagcca 
tcaaaaccac 
aaaataacag 
agtgtaaact 
cagaactacc 
cattctacca 
aggcatggaa 
tacacaccat 
catggatgga 
accgtatgtt 
gaaacaacag 
aaagataatt 
tcccatcaca 
aaagtttttt 
tcatttgtat 
acccctgaaa 
aaagagatat 
gaaattgagt 
atttgtaagt 
tgctgtgcag 
cctgtgcttt 
tcccagagtt 
tttgattaga 
gtaatgtcac 
gttaatctag 
ttggtttttg 
ttattttctt 
gttagatcat 
tttcctttta 
ttatttattt 
tcagtagcaa 
tttttatttt 
gattgacact 
aggacaagaa 
caattaatca 
aaggctgtaa 
tgtcagtcta 
tgtaggatag 
tgtcattttt 
ctgctgtctt 



attagttcat 
ccatttgacc 
caaaaccaca 
tcaacgtaga 
tgaatactat 
aaggccatta 
attaaaggta 
gggggttaat 
aacagaatag 
gctgacaaaa 
tggctagcca 
caactcaaga 
aaatgtcggc 
aaaagcaatt 
agcttcttca 
aaaatatttc 
aaatttacaa 
ggctcatgcc 
ggagttcaag 
ataactgggc 
aatcacttga 
gcctgggcga 
acataataaa 
acaattatat 
aatgagatgc 
atgctggtga 
cgttcagcca 
atttgatcca 
taaagataca 
tcaacttaaa 
gtaatactac 
tctggaggct 
cttatttata 
acactggggt 
atgggtacaa 
catgttcacc 
tttttaaagg 
accaaatatc 
ctaaaaatag 
ctataatcat 
tgttcaaatt 
attttctccc 
agatttcttt 
agaggcctta 
ttcttctagt 
tttttgtata 
ctttgtcatt 
ctagcagtct 
gatgtatttt 
ttactggctt 
taatttgaga 
acactgattt 
caaatatata 
attgtttaat 
tattccagtg 
ggctttatgg 
cgtatattct 
agtgtcaaat 
ttggggtgtt 
gaagtacttg 
ttaagtcttc 
tccagttttt 
ttatttttca 



ccactgtgga 
cagcaatcct 
tgcacttgtg 
taaccatcaa 
gaagccatga 
tcctaagtga 
tgagagaaac 
agaaggggaa 
agagcccaga 
acaggcaatg 
tatgcggaag 
tggattagag 
agtatcatcc 
gcaaaaagaa 
cagcaaaaca 
caaactatgt 
gagaaaagca 
agtaatccca 
accagcctgg 
atggtggtgc 
actcaggagg 
cagagtgaga 
agtgggcaaa 
gaaaaaaggt 
catcttacac 
ggttgcaaaa 
ctgtggaaag 
acaatcccat 
ttcatgtgac 
tgcccatcaa 
atagccataa 
attatcttta 
agtgggagct 
cttcttgagg 
ggcttaatac 
tatgtaacaa 
aaaactgttg 
agtgacattc 
aaaaaattat 
ttgcacattt 
tcctatgtaa 
attgtgtagg 
tgtttgatga 
tccaaaaact 
agatttatag 
gtagtctatg 
tctgatcgtg 
atccattttg 
ggggtctcag 
tggagttagt 
actttctaac 
tgctgcatcc 
ttttgttaat 
tcccatataa 
ttctccaaga 
cctagtatgt 
gtggttatgg 
ttaagtccag 
gaattcccct 
ttttatgaat 
ttgtttaatt 
ggtttacagc 
tttgtctagt 



56340 
56400 
56460 
56520 
56580 
56640 
56700 
56760 
56820 
56880 
56940 
57000 
57060 
57120 
57180 
57240 
57300 
57360 
57420 
57480 
57540 
57600 
57660 
57720 
57780 
57840 
57900 
57960 
58020 
58080 
58140 
58200 
58260 
58320 
58380 
58440 
58500 
58560 
58620 
58680 
58740 
58800 
58860 
58920 
58980 
59040 
59100 
59160 
59220 
59280 
59340 
59400 
59460 
59520 
59580 
59640 
59700 
59760 
59820 
59880 
59940 
60000 
60060 
agattatttt 
atggatggat 
aaatttcttg 
ttttactttt 
ttaacattga 
tgcttgttgt 
ttaagtaatt 
taggtttttg 
gttttgaaca 
cacttttaat 
attgcctact 
tacttatgat 
atgttctttt 
atttctttca 
gaatttcctc 
aagaataact 
agtatagcat 
tattcaagct 
agtttataaa 
gctgggaggc 
ttcacatggc 
agatctcatg 
aacaccccga 
attcaatata 
tcccaaattt 
caaagtttta 
aaatgtccaa 
gttatttatt 
aggaagaaat 
ggcagtcatt 
ggccacacta 
tgcagggtgc 
cagatgcatg 
cttttcttac 
cacattttcc 
agacttctgc 
ccaagcctta 
aagcttgggg 
cctgactgga 
ctgcagacct 
tgggagtggc 
gctattaaca 
ataaaatggg 
ctctgctttc 
aagccaggct 
ttcatctctc 
tctctttgtt 
tccatctgag 
accaccactc 
agccctccaa 
ttgggtatct 
cacatcactg 
attcaccgtt 
tcatggtgga 
ggggaagtgc 
ctaggggaat 
gtcccacctc 
ccaaaccata 
tctttgtctt 
ttgaatttga 
ctgagaaatt 
tccatttgga 
ataggcctgc 



tcatcctttt 
cttatttttt 
aagacagcac 
ttttccattg 
agagtattat 
tttctaaatc 
ttctttagaa 
ttttgtgggt 
gataatgact 
atgccctaca 
tctaaactgt 
acaagtggtt 
acgagggaat 
gactgaagaa 
agcttttgtt 
ttaatgcaca 
accatttcct 
ctgtattact 
gtgaagaggt 
ttcaggaaac 
gacagaggag 
agaactctat 
tgatccattc 
aggcttaggt 
catgttcttc 
actcattcca 
aatctcatct 
tctaagatac 
tggccaaaat 
aaatcttaaa 
atgcaagggt 
agccctgttg 
gcacaagctg 
agatccatta 
ctctgcattg 
ctggatatcc 
attcatgtct 
cttgcaccct 
gttggagtgg 
gggtctggcc 
tgtaataaag 
tgtagctcct 
tttttctttt 
cttttaaata 
aattcttgaa 
tcaaattcaa 
aaagcatagc 
accacctcag 
aacaagtctc 
actgttccaa 
ataggaatgc 
taaagaacta 
ccttaggctg 
agggcaaagg 
tacacacttt 
ggtgctaaac 
cagcagtagg 
tcagtctcca 
tgatttttcc 
ttggtgagca 
ttcagttatt 
tatttcaatt 
tttacttttt 



actttgagcc 
atctaacttg 
atacttgagt 
agcagtcact 
tgttaggtaa 
ctttatttct 
gtatgttttg 
accgtgaggt 
taattttgat 
cacattttga 
agttattgat 
tatttaccac 
tttatacatt 
ctctctttag 
tttctgcgaa 
taatattctt 
cttgacctgc 
ctgttctcac 
ttaattgaca 
ttacaatcat 
agagggtaaa 
cataagatag 
acctcccacc 
gggaacacag 
tcacatttca 
gcacaaactt 
gagacaaggt 
aagggagtta 
aaaggagcta 
gctccaaaat 
tgctctctca 
gctgctttca 
tgagtggagc 
cgcaggaccc 
cactagtaga 
aggcatttgc 
tctgcacacc 
ctgaagcaat 
ctaggacgca 
cccgacatca 
gtctctgaaa 
ctttatttat 
ctaccacatg 
taaattccaa 
tgctttgctg 
agttccacag 
aagagttacc 
cctggacttc 
taggaagttc 
cctctgcttg 
cccacttctc 
actgagattg 
tataggtagc 
gaaaacaagc 
taaataacag 
cattagaaat 
agacggcaaa 
ttaaatgtga 
taatttgatt 
atggtctttt 
atttcaataa 
atgcaaaagt 
tttttttcag 
tatgtgagat 
ctactctagt 
attgattttt 
cagcgtcttt 
gagtttacta 
tgtgttctat 
attccatgct 
ttacaaaaaa 
tgcaaataaa 
cttttggtgt 
ggttttaata 
aattatagta 
ccaatgtttt 
catttcttga 
aaaaaaatct 
ccttgaaaga 
tagcttctgt 
attgctaaac 
cacagttcca 
gatggaaggt 
ggggaaagtg 
cactagggga 
aggccccacc 
agtcaaacca 
aaacgcaatc 
aaatgtccaa 
ccacaggtaa 
caggcagtgg 
catatgccat 
aatctccttt 
aggcctgggg 
caggctgatt 
taccattctg 
agtgggaact 
ggctctctgt 
atacaacctc 
cacaggccag 
ggcccaagct 
ggtcatcatg 
ttttttccta 
tttcttggag 
gcaaacttct 
ttcaggctgc 
tttcagacca 
cttagaaatt 
ctctctagag 
tttactccac 
actctccata 
caaactttct 
ttacccaatt 
tggtaccaat 
agtaatttat 
atggttgaga 
acatggcaag 
atattgtgag 
cacccccatg 
tttagatgaa 
tatttttctc 
acagtgtaac 
ctacctagat 
gtatagtttc 
taatgtactt 
ctcctctgag 



agatctcttg 
atatgcctgt 
gtttgtttgt 
taattgtaga 
ctgccatttt 
cttactgtat 
atgtttttgt 
aatctatagt 
agcatcaaaa 
ttcaatgtac 
gttttgtatt 
tgacagtatt 
cctgttacat 
aaaacaggtc 
ttatttcacc 
ttttttctct 
gagaaatctg 
aaaattacct 
caggctgtac 
gaaggggaag 
ctacacactt 
atggtattaa 
tccaaaactg 
tatcattctg 
atgtcttccc 
ttcctcatgt 
gcctgtaaaa 
ataaatgctt 
gcaagtccaa 
gactccatgt 
caactccacc 
ttgagtgctt 
gtttctgaaa 
cggtgtgggg 
gaggattctg 
taaaacttag 
acactatgtg 
gtaccttggc 
ttctgaggct 
tttgccctct 
gcattttccc 
gcagccttga 
aaatttttca 
tctctttgaa 
tcttccacca 
caggggcaca 
ttcctaataa 
ttactattag 
cctcttcctg 
acaaagtcga 
tttatgtatt 
gaagaaaaga 
aggcctcaga 
aggagaaaaa 
aactctaaca 
atccaatcac 
atttgggtgg 
ttgctgattt 
ctgagatatt 
gctattgtgt 
taggcctttt 
gattgtgtcc 
aagttagttt 



aaaattgcag 
ctttataggt 
ttttcttttg 
attttgtcta 
gttatttctt 
tactctgtag 
gtatctgtca 
ataacagttc 
accaactcta 
atgtttttat 
ttattcttca 
ctgattttga 
gttagtaccc 
tggtgttgat 
ttcatgttcg 
tatcatcttg 
ctgaaagcca 
tagcctgagt 
aggaggcatg 
caagcacatc 
ttaaacaaac 
accattagaa 
ggggattaca 
tcccggtccc 
aatagttccc 
ccaaatgtcc 
ctaaaaacaa 
ctgttccaaa 
aatccagaag 
ttcacatcca 
ctcatggctc 
gtggcttttc 
gatagtgtcc 
gctccaaccc 
cccctgcagc 
gcaaaagttc 
gaacttgcca 
actttttagt 
acacagagca 
gagcctgtga 
cactgtcttg 
attcctccct 
aactttcatg 
aatagagaag 
gatacctgaa 
atgccaccag 
gttctacatc 
tattttgttc 
tcttcttctg 
ttccacattt 
agtccattct 
ggtttaatcg 
aaactaacaa 
cagagcgaag 
caaggcaaca 
ttcccaccag 
agacacagag 
gaatattatt 
actgtttgta 
ttctccagaa 
tatctctcat 
cataattttc 
caaatgttct 



60120 
60180 
60240 
60300 
60360 
60420 
60480 
60540 
60600 
60660 
60720 
60780 
60840 
60900 
60960 
61020 
61080 
61140 
61200 
61260 
61320 
61380 
61440 
61500 
61560 
61620 
61680 
61740 
61800 
61860 
61920 
61980 
62040 
62100 
62160 
62220 
62280 
62340 
62400 
62460 
62520 
62580 
62640 
62700 
62760 
62820 
62880 
62940 
63000 
63060 
63120 
63180 
63240 
63300 
63360 
63420 
63480 
63540 
63600 
63660 
63720 
63780 
63840 
atcttcaagc 
ctataatccc 
gaccagtctg 
gtttgtgatg 
ttgaacccgg 
gcaacaagag 
gaggcaggga 
tggctttgga 
ttccctgaga 
gaagcatttt 
attaaatctg 
ggttcccatc 
gtgcagtaca 
gtctaaagtc 
ccaggatcca 
ccctcccaca 
ttcaggcata 
tttcagattt 
aaaaaaaaaa 
gattaatatt 
ttattttcct 
ttctttaaat 
ttcttgaact 
aattgttcag 
ttggttttta 
ttccagtttt 
ttttttttta 
tactgcagcc 
gctgggatta 
aggttttacc 
ttggccaccc 
cttaaatgtt 
aatttctctt 
aaatactgtg 
aatcccagtc 
ggtttgtgga 
ctatgacact 
accaggctcc 
cagatgattc 
tactgatgaa 
cctccttagc 
aaagggggta 
ctgcacttct 
ctagttttat 
ctttctgact 
ccttccatgg 
agcataatta 
aagtaacact 
ccacctctca 
aaccttttta 
catttttatg 
cacaaaatgt 
tggggggcat 
gagcaaagaa 
ctcattacat 
attattcctc 
tttagataat 
atcactaatg 
agacattcag 
tgcaaatgag 
cataggttcc 
gatagcttag 
ctagactgct 



ttactgattc 
agcactttgg 
gccaacatgg 
atgggcacct 
ggggcagagt 
cgaaactctg 
gtcagcgtca 
ggtagaatgc 
acctccaaaa 
gggctttgga 
tgatagttta 
ttcaaaggct 
gttggctttc 
cagacgaacc 
ctgagtgtag 
gggctcaacc 
taattaaatt 
gtttttattt 
aagatttcta 
tttaaacatt 
ttttttttag 
atttttttaa 
tctttaagag 
ggtccattgt 
taatcttcgc 
cgcaggtgtt 
agacagagtc 
tctggctccc 
ccggcataca 
atggtggcca 
aaagtgctag 
gacttgttgt 
ctgtcattgt 
ttgggactat 
ttttaattgt 
aaatccggct 
aatcggtctg 
atgcatcagg 
aattctccta 
gattcctaag 
agacataggt 
gtgcaaccgg 
cctctgagtt 
ttttatgggg 
ttaccctcca 
agctcataca 
gtttcctaat 
catcactgct 
catcttgcta 
atcaaatatc 
ttctttattt 
ttgcattttt 
gaatcagttt 
gtaggaaagg 
gcttactata 
ctaaaaactc 
ttgacactgt 
gggattagat 
tatattagag 
cgagatatta 
agagccaccc 
ggaattgctg 
gacaagttgg 



tttcctctgt 
gaggccgaga 
tgaaacccca 
gtaatcccag 
ttgcagtgag 
tctcaaaaaa 
tagcaatggg 
gggcagcctc 
agaaagcagt 
catccagaaa 
ttatggcagc 
tcccctctgc 
gagcagggct 
ttgaacaccc 
agcagagcaa 
agctaacctc 
tgcataaatt 
cctgagtttg 
atgagttttt 
gtttttattt 
ttttctgttt 
attccttatt 
gcttatgctg 
tgagcctctg 
atatttatgt 
ctttgttggt 
ttgctctgtt 
aggttcttcc 
gcactacgcc 
ggctggtctc 
gattacaggt 
tgcttctgct 
ttcctagtgt 
attgctgtgc 
tttggggcca 
aagaattctt 
ttaaatgtgg 
tactgcattc 
gcactcccaa 
ctggagggga 
ctagggaaat 
aaatgaccgt 
ctgttgtatt 
acagtgatgc 
aaagtgtaat 
acaaacaatg 
cccagagtcc 
attccttagt 
taaatatgac 
aaggaaatga 
ttggttaaag 
attattaaaa 
ctcctctgaa 
aggttagcta 
tgcaaatcat 
catttctcat 
ggtggtggta 
tcataaaatt 
taggcccagt 
accaagttca 
agaacttttg 
cttttgggta 
ccattagtgg 
catcaaaact 
agggtggatc 
tctctactaa 
ctacttgtga 
ccaagatcgt 
agaaaaagga 
tgagaaagat 
caaaacctgg 
cctgccgaca 
tgcaagacaa 
aacagaaaaa 
ttgccgtcca 
tggattgaaa 
cttaattgcc 
cattcctcaa 
tttccaccct 
caaagagcat 
ctaaggaaat 
aagttcagtt 
ctctttccaa 
tcttataatt 
tcctcagaga 
aattcattgt 
ttgatttctt 
cgatgcctgt 
cttagatctt 
acccatgaat 
agcgattctc 
tggctaattt 
gaactcctgg 
gtgagccact 
ggggtaggct 
ggggaagtct 
tgtcattgtt 
ggctggggga 
gctgtactgc 
catctctact 
agtgtcctat 
aggtttttat 
gactgaacat 
tctttgtgag 
ttcttttaca 
tacaatggag 
taggggattt 
cctcaaaatt 
atacatttga 
tttaattaaa 
ccctctgaga 
tgcattttaa 
gtatatatca 
gggaaatata 
atcatttttt 
aaattctcag 
tgaaaagtgt 
ttttctaagc 
tttatagctg 
cgatttgtta 
gttacttttt 
atcatgtaat 
attagaaagt 
tttgcttggg 
tcactggtca 
acaggtatat 



atgggcacgg 
atctgaggtc 
aaaaatacaa 
ggctgaggca 
gccatcgcag 
tagaggaggt 
gaaagatgtg 
aaaaagcaag 
ccttgattct 
aatttgtgtt 
gaatatacct 
gtccacagct 
attcagcatg 
cctcccgtcg 
ggtgtcctca 
ttgaattgtc 
atgctgcaat 
aataaaaaca 
tttgtattct 
tttcttattt 
tcttttttta 
tctttgttct 
cagacattta 
ttggtggtgt 
gcatttgaag 
tagttcttag 
gcagtggtgt 
ctgcctcagc 
ttgtattttt 
cctcaggtga 
gtgcctggcc 
tatagtgacc 
tattttgtgc 
tccaattccg 
ggcttcatga 
caactgtgct 
aattacagtg 
tctttgttct 
gagagaggac 
ccaattccat 
tggcattata 
ctccacaact 
ctcccatctt 
tctattccat 
tcaagatttt 
aattatgaca 
cacaagaggg 
tttgctttaa 
agcttagaga 
gaaccagaat 
ttaatgtatt 
tcttttactt 
tgccttctct 
taactccatt 
actttaaatg 
ggaacactaa 
ctgatgaagc 
gaatactaaa 
ctaatttatt 
tagttggtgg 
agaatctctg 
tttgcatctg 
tgagagttga 



tggctcacgc 
aggagtttga 
agaattagct 
ggagaatccc 
tctagcctgg 
tgggaagagg 
tggccattgc 
aaagtagatt 
agcccagtaa 
gttttaaggg 
gtcaaaactg 
ccacttgcag 
taaggaagaa 
agagcaggga 
cccccaggca 
tctctctccc 
cccattgtaa 
aaacctcctt 
ttatttctgg 
tcttcctgaa 
acatttttct 
gacctgtaat 
atagatcttt 
catatttccc 
agagagacac 
tattcttttt 
gatcttgggt 
ctcccaagta 
agtagagatg 
tccacccgcc 
agtattcact 
actgcaacta 
actgaagctt 
gaaagttatt 
aagaaccttg 
tcctgtattt 
ctgagtagcc 
cagttcacct 
cagagtggat 
ttcccccctt 
ctagcttggg 
ttccttgatt 
tgaatagctt 
catctttctt 
tttttgtatg 
atatgactga 
atgattccag 
gagcaggtct 
agcacctaaa 
taagtccttg 
aactgaatct 
tgcatatcat 
cagaggctgt 
tgtaacaata 
agtaaactat 
cgtaagacag 
tagaatcaaa 
atatttactc 
ttgcaaaaat 
cagagctgag 
agcacacttg 
atcataggta 
tttaagtgtt 



63900 
63960 
64020 
64080 
64140 
64200 
64260 
64320 
64380 
64440 
64500 
64560 
64620 
64680 
64740 
64800 
64860 
64920 
64980 
65040 
65100 
65160 
65220 
65280 
65340 
65400 
65460 
65520 
65580 
65640 
65700 
65760 
65820 
65880 
65940 
66000 
66060 
66120 
66180 
66240 
66300 
66360 
66420 
66480 
66540 
66600 
66660 
66720 
66780 
66840 
66900 
66960 
67020 
67080 
67140 
67200 
67260 
67320 
67380 
67440 
67500 
67560 
67620 
gcatctgtca 
aatccatctg 
ttccacttag 
ttagaacatt 
cctcatacat 
tcttcttcct 
aaaactacat 
atattttcaa 
tgcttctact 
aaaatcgtat 
ctaaaaatgc 
ccctaacaac 
tatcactgat 
taggaagagc 
catttccatc 
cagtgggtgc 
gcgcaagggg 
gaaaatcagg 
gccagattat 
ctagcacagc 
gccattgccc 
ccaccacagc 
agacaaacaa 
gaagagagca 
gacccctgac 
ctcacacggc 
actaacaaac 
accaaaagta 
taaaaagcag 
aagctggacg 
caagctacgg 
agaagaatgt 
gaaaaccaag 
ctggaagaaa 
tttagagaaa 
aaaagaccaa 
ttggaaaaca 
aacatccaga 
ccaagacaca 
gccagagaga 
tcggcagaaa 
aagaattttc 
ataaaatact 
aaagagctcc 
aatcatacca 
aaataaccag 
aaagtaaatg 
gtcaagaccc 
ggctcaaaat 
gggttgcaat 
aagaaggcca 
atatatatgc 
aaagagactt 
acattagaca 
ctgcaccaag 
atattttttt 
cctctactca 
gcaatcaaac 
ctgaacaacc 
atgttctttg 
aaagcagtgt 
tccaaaattg 
tcaaaagcta 



aggttaacac 
tgtactttac 
ctcctagcac 
tccaatacca 
tctcctttcg 
ctctagtttc 
tagcaaatat 
atattttaag 
cttcatttta 
tttatcctgc 
atatatgtaa 
agactcacta 
gaaataaaaa 
tcgggtctac 
tgaggtactg 
gcacaccctg 
tcagggagtt 
tcactcccac 
atcccacacc 
agtctgagat 
aggcttgctt 
tcaaggaggc 
aaagacagca 
atggttctcc 
ccctgagcag 
cgggtactcc 
agaaaggaca 
gataaaacca 
agcgcctctc 
gagaatgact 
gaggacattc 
ataactagaa 
ctcgagaact 
gggtatcagc 
aaagaataaa 
atctacgtct 
ctctgcagga 
ttcaggaaat 
taattgtcag 
aaggtcgggt 
ctctacaagc 
aacccagaat 
ttacagacaa 
tgaaggaagc 
aaatgtaaag 
ctaacatcat 
gactaaacac 
atcagtgtgc 
aaaaggatgg 
cctagtctct 
ttacataatg 
acccaataca 
agactcccac 
gatcaacgag 
cgtacctaat 
cagcaccaca 
gcaaatgtaa 
tagaactcag 
tgctcctgaa 
aaaccaacga 
gtagagggaa 
acaccctaac 
gcagaaggca 



gaaacattct 
tttccttgca 
acatcatttg 
ttaaagaaga 
ctgaaagtaa 
caacagcata 
tttcctttgc 
tcaccttgaa 
aaaattttgc 
agtcctctag 
tcttctattt 
aaattcgctg 
acaaaacaag 
agctcccagc 
ggttcttctc 
cgcgagcgga 
ccctttccta 
cccaatactg 
tggcttggag 
caaactgcaa 
aggtaaacaa 
ctgcctgcct 
gtaacctctg 
cagcacacag 
cctaactggg 
aacagacctg 
tccacaccga 
caaagatagg 
ctcctccaaa 
ttgacgagct 
aaaccaaagg 
taaccaatag 
acatgaagaa 
aatggaagat 
aagaaatgag 
gattggtgta 
tattatccaa 
acagagaacg 
attcaccaaa 
taccctcaaa 
cagaagatag 
ttcatatcca 
gcaaatgctg 
gctaaacatg 
accatcgaga 
aatgacagga 
tccaattaaa 
tgtattcagg 
aggaagatct 
gataaaacag 
gtaaagggat 
ggagcaccca 
acattaataa 
acagaaagtc 
agacatctac 
ccacacctac 
aagaacagaa 
gattaagaat 
tgactgctgg 
gaacaaagac 
atttatagca 
atcacaatta 
agaaataact 
gttggccctg 
aataaatgat 
tatttcaatc 
aggaaggcaa 
tatcattttg 
aagctttgta 
atctccttat 
ggttgcttct 
ttgtttccac 
gacaaattta 
ctagtagcag 
tgatattttt 
aaagacaggg 
gtgagcaaca 
actagggagt 
agcagggcga 
gtcaaagaaa 
cacttttctg 
ggtcctacac 
ggtggcagcg 
agcagccctg 
ctgtaggctc 
cagacttaaa 
ctggagatct 
aggcaacccc 
cagctgaggg 
aaacccatct 
gaaaaaacag 
ggaacgcaat 
gagagaaggc 
taaagaactt 
agagaagtgc 
tgcagaagcc 
gaattgaatg 
caaagcctcc 
cctgaaagtg 
gagaacttcc 
ccacaaagac 
gttgaaatga 
gggaagccca 
tgggggccaa 
gccaaactaa 
agagattttg 
gaaaggaaca 
ctaggaagaa 
tcaaattcac 
agacacagac 
agacccatct 
accaagcaaa 
actttaaaac 
caattcaaca 
gattcataaa 
tgggagactt 
aacaaggata 
agaactctcc 
tccaaaactg 
attataacaa 
ctaactcaaa 
gtacataacg 
acaacatacc 
ctaaatgcct 
aaagaactag 
aaaatcagag 



agtctcacat 
ctagtgaatt 
atttattttt 
tgaaaaggag 
aaaacagact 
tgtctctgag 
ttcaagtaaa 
taaaaaataa 
ctttacaaac 
actttggggt 
agtttgaact 
caaatagtat 
gaggagccaa 
cagaagacag 
gccagacagt 
ggcattgcct 
ggggtgacag 
acgggattaa 
ccacggagtc 
aggctagggg 
aagctggaac 
cacctctggg 
tgtccctgtc 
gcctcctcaa 
cagcgggggc 
tcctgtctgt 
gtacatcacc 
agcagaaaaa 
tcctcaccag 
ttcagacgat 
gaaaactttg 
ttaaaggagc 
tcaggagccg 
aaatgaagtg 
aagaaatatg 
atggggagaa 
ccaatctagc 
actcctccag 
aggaaaaaat 
tcagactaac 
cattcaacat 
gcttcataag 
tcaccaccag 
accggtacca 
actgcatcaa 
acataacaat 
tggcaaattg 
cacgtgcaga 
tggaaaacaa 
aacaaagatc 
agaagagcta 
gcaagtcctg 
taacaaacac 
cccaggaatt 
accccaaatc 
accacatact 
actatctctc 
accactcaac 
aaatgaaggc 
agaatctctg 
acaagagaaa 
aaaagcaaga 
cagaactgaa 



ttcaatctag 
tatatatttt 
cttcaccata 
acggagatag 
acgggtccca 
gacaagatag 
tgagggactt 
ctgtttatta 
aaaaatggac 
actctgtttc 
caattaattt 
tttatggaat 
gatggccgaa 
gtgatttctg 
gggcacaggt 
cactcaggaa 
atggcacctg 
aaaacggcgt 
tcactgattg 
aggggcgccc 
tgggtggagc 
ggcagggcac 
tgacagcttt 
gtgggtccct 
agactgacac 
tagaaggaaa 
atcatcaaag 
ctggaaaccc 
caatggaaca 
caaactactc 
aaaaaaattt 
tgatggagct 
atgcgatcaa 
agaagggaag 
ggactatgtg 
tggaaccaag 
aaggcaggcc 
aagagcaact 
gttaagggca 
agcggatctc 
tcttaaagaa 
tgaaggagaa 
gcctgcccta 
gccactgaaa 
ctgacgagca 
attaacttta 
gataccaaga 
gacacacata 
aaaaaggcag 
aaaagagaca 
actatcctaa 
agtgacctac 
cccactgtca 
gaactcagct 
aacagaatat 
tggaagtaaa 
agaccacagt 
tatgtagaaa 
agaaataaag 
ggacacattc 
gcaggaaaga 
gcaaacacat 
ggaaatagag 



67680 
67740 
67800 
67860 
67920 
67980 
68040 
68100 
68160 
68220 
68280 
68340 
68400 
68460 
68520 
68580 
68640 
68700 
68760 
68820 
68880 
68940 
69000 
69060 
69120 
69180 
69240 
69300 
69360 
69420 
69480 
69540 
69600 
69660 
69720 
69780 
69840 
69900 
69960 
70020 
70080 
70140 
70200 
70260 
70320 
70380 
70440 
70500 
70560 
70620 
70680 
70740 
70800 
70860 
70920 
70980 
71040 
71100 
71160 
71220 
71280 
71340 
71400 
acacaaaaaa 
aaaattgata 
gtgataaaaa 
agagaatact 
ttcctggaca 
caataacagg 
gaccagatgg 
ttctgaaact 
tcagcatcat 
aatatccctg 
gcagcacatc 
ctggttcaat 
aaaccacatg 
catgctaaaa 
tatctatgac 
tttgaaatct 
ggaagttctg 
agaggaagtc 
catctcagcc 
aatcaatgta 
atcatgagtg 
cttacaaggg 
aaagaggata 
gtgaaaatgg 
ccaatgactt 
agagcccgca 
cctgacttca 
aacagagata 
aactatctga 
aataaatggt 
cttacacctt 
accataaaaa 
gactttatgt 
ctagttaaac 
cctacaaaat 
atctacaatg 
gaaggacatg 
aaaatgctca 
tctcacacca 
atgtggagaa 
gtggaagtca 
catcccatta 
cacaagtatg 
tccaacaatg 
agccataaaa 
attctcagta 
gggaattgaa 
tgtggggtgg 
gttagtgggt 
gtgcacatgt 
caagaacaca 
tatgctcatc 
ctaagttttt 
ctctacctct 
ccccatataa 
ataatgtcct 
gaataatatt 
cacttgggtt 
agatgcttta 
atgaggacta 
ttagctgatc 
tttgctttac 
aaatagatgc 



cccttcaaaa 
aaccagtagc 
atgataaagg 
acaaacacct 
catacactct 
atctgaaatt 
attcacagcc 
attccaacca 
cctgatacca 
atgaacattg 
aaaaagcttg 
atatgcaaat 
attatctcaa 
actctcaata 
aaacccacag 
ggcacaagac 
gccagggcaa 
aaattgtccc 
ccaaatctcc 
caaaatcaca 
aactcccatt 
acgtgaagga 
caaaccaatg 
ccatactgcc 
tcttcacaga 
tcgccaagtc 
atctatacta 
tagatgaatg 
tctttgacaa 
gcagggaaaa 
atacaaaaat 
ctctagaaga 
ctaaaacacc 
taaagagctt 
gggagaaaat 
aacttaaaca 
aacagacact 
ccatcactgg 
gttagaatgg 
ataggaacac 
gtgtggtgat 
ctgggtatat 
tttattgtgg 
ttagactgga 
atgatgagtt 
aactatcgca 
caatgagaac 
gggagggggg 

gcagcacacc 
accctaaaac 
cacacacaca 
acacacaaat 
catgtacttt 
ggtaaccact 
gtgagatcat 
ccaggcttgt 
ctattgtaaa 
gtttcaatac 
agaggtggtg 
ggggtaataa 
ttgcacacag 
tatagtaaga 
attaaaattt 



aattaatgaa 
aagactaata 
ggatatcacc 
ctatgcaaat 
cccaaactaa 
gtggcaataa 
gaattctacc 
atagaaaaag 
aagccgggca 
atgcaaaaat 
tccaccatga 
caataaatgt 
tagatgcaga 
aattaggtat 
ccaatatcat 
agggatgtcc 
ttaggcagga 
tgtttgcaga 
ttaagctgat 
agtattctta 
cacaattgct 
cctcttcaag 
gaagaacatt 
caaggtaatt 
attgcaaaaa 
aatcctaagc 
caaggctaca 
gaacagaaca 
acctgaggaa 
ctggctagcc 
caattcaaga 
aaacctaggc 
aaaagcaatg 
ctgcacagca 
tttcacaacc 
aatgtacaag 
tctcaaaaga 
ccatcagaga 
caatcattaa 
ttttacactg 
tcctcaggga 
acccaaagga 
cattattcac 
ttcagaaaat 
catgtccttt 
agaacaaaaa 
actggacaca 
agggatagca 
agcatggcac 
ttaaagtata 
cacacaaaaa 
gcaaggtcat 
gtatcttcta 
gttttatttt 
tcagtatttg 
acgtgttgta 
tatataccac 
cttagctgtg 
aattcacagg 
aattataccg 
acacacaaaa 
ttttactcta 
attttttaaa 
tccaggagct 
aagaaaataa 
accgatccca 
agactagaaa 
accaggaaga 
tcaatagctt 
agaggtacaa 
agggaatcct 
gagacacaac 
cctcagtaaa 
tcaagtgggc 
aatccagcat 
aaaggccttt 
tgatgggacg 
actgaatggg 
tctctcacca 
gaaggaaata 
cgacatgatt 
aagcaacttc 
tacaccaaca 
tcaaagagaa 
gagaactaca 
ccatgctcat 
tacagattca 
actactttaa 
caaaagaaca 
gtaaccaaaa 
gagccctcag 
aacaagcaat 
atatgtagaa 
tggattaaag 
tttacctttc 
gcaacaaaag 
aaagaaacta 
tactcatctg 
aaaatcaaac 
agacatttat 
aatgcaaatc 
aaagtcagga 
ttggtgggac 
tctagaacta 
ctataaatca 
aatagcaaag 
gtggcacata 
gtagggacat 
accaaacact 
ggaaggggaa 
ttgggagata 
atgtatacat 
ataataataa 
aaaaaaaaaa 
ttttaggcat 
acctgcatac 
tatcactgta 
tctttctgtg 
tcaaataaca 
catttcttta 
atgaataata 
ataaacaagt 
tatttgggat 
agggtaacta 
tgtatcacat 
agggttcatt 



gtttttttga 
gagagaagaa 
cagaaataca 
atctagaaga 
agttgaatct 
accaaccaaa 

ggaggagctg 

ccctaactca 
caaaaaagga 
atactggcaa 
ttcatccctg 
ataaacagaa 
gacaaaattc 
tatttcaaaa 
caaaaactgg 
ctcctattca 
aatggtattc 
gtatatctag 
agcaaagtct 
acagacaaac 
taaaatacct 
aaccactgct 
gggtaggaag 
atgccatccc 
agttcatata 
aagctggagg 
cagcatggta 
aaatatcgcc 
ggggaaagga 
agctgaaact 
acttaaacgt 
aggacatagg 
ccaaaattga 
ccatcagagt 
acaaagggct 
aaccccatca 
gcagccaaaa 
aaaaccacaa 
aaaaacaggt 
tgtaaactag 
gaaataccat 
tgctgctata 
acttggaacc 
tacaccatgg 
ggatgaaatt 
gcataccctc 
catcacactc 
aacctaatgc 
atgttacaaa 
agaaatgggt 
aaaacaagaa 
tagatggggc 
ctctatttcc 
tatttaattt 
tctggtttat 
cgatcttcct 
tctatttgtc 
ctgcaaaaac 
ctagagatct 
tcgtgctaaa 
tatgttaatg 
gacattatgt 
aactgcggtt 



aaagatcaac 
tcaaataggc 
aactaccatc 
aatggataaa 
ctgactagac 
aagagtccag 
gtaccattcc 
ttttatgagg 
attttagacc 
accgaatcca 
ggatgcaagg 
ccaaagacaa 
aacaacactt 
taataagagc 
aagcattccc 
acatagtgtt 
aattaggaaa 
aaaaccccat 
caggatacaa 
agagagccaa 
aggaatccaa 
caatgaaata 
aatcaatatc 
catcaagcta 
gaaccaaaaa 
catcatggta 
ctggtagcaa 
gcatatctac 
ttccctattt 
ggatcccttc 
tagacctaaa 
catgggcaag 
caaatgggat 
gaacaggcaa 
aatatccaga 
aaaagtgggc 
aacacatgaa 
tgagatacca 
gctggagagg 
ttcaaccatt 
ttgacccagc 
aagacacatg 
aacccaaatg 
aatactatgc 
ggaaatcatc 
actcataggt 
tggggagtgt 
tagatgacga 
cctgcatgtt 
cttctctacc 
agacaaacag 
ttaatttaat 
tcttctagcc 
tttaaatatt 
ttcacttaga 
tttcagggat 
cttcatgagg 
atggaagtac 
aatgaacaac 
tgaatatatc 
tttgtgttaa 
tgtgaacctt 
caggcaatct 



71460 
71520 
71580 
71640 
71700 
71760 
71820 
71880 
71940 
72000 
72060 
72120 
72180 
72240 
72300 
72360 
72420 
72480 
72540 
72600 
72660 
72720 
72780 
72840 
72900 
72960 
73020 
73080 
73140 
73200 
73260 
73320 
73380 
73440 
73500 
73560 
73620 
73680 
73740 
73800 
73860 
73920 
73980 
74040 
74100 
74160 
74220 
74280 
74340 
74400 
74460 
74520 
74580 
74640 
74700 
74760 
74820 
74880 
74940 
75000 
75060 
75120 
75180 
ataaaccttt 
aggggttcat 
gaattcttcc 
ttattcactt 
gattacggcc 
ttttaattgt 
atggcttgtg 
gtgcttagaa 
acataatttt 
ccttatcctc 
ctcaaatccc 
ggctccaacc 
gtgtgtgtta 
tcagttagac 
ccagtgtatc 
gagtggtgga 
aattggtctt 
acgaactccc 
gtctgctgtt 
aggcctcggg 
aatgcaacat 
gtctgtgggt 
tgctcctgta 
tacaacttgt 
gcagaacctt 
caccttttat 
cacatcttag 
attgtgtatt 
ctcacaggca 
agtccaattt 
ctcatttttc 
ataggaactt 
gtgaaatatc 
gtgtgtgtgt 
agaaccatta 
ctgagacagg 
gacacagctt 
ttatacattt 
gtgcagaaca 
taaatttaaa 
agtctgtgtt 
taaccgcctg 
ctttatttat 
gttgttgcca 
ctggagtaca 
cttctgcctc 
tttttgtatt 
tgacctcagg 
atcctgcctg 
aactgtattt 
ataatagctc 
gtttgttcgc 
cttgtttcat 
ttattatttc 
attactgaag 
atagcttcta 
gtcacatagt 
atggtgcact 
atatctcaag 
ggtgatatgt 
tcttacctca 
cctttttgta 
tctaaaactt 



tacattttgg 
ttgaacaagc 
taagtagttt 
gtgattgaat 
tagtgcattc 
tgagagttgg 
tgttccaggt 
gaccatccca 
tttgtacaag 
ctcgcatggc 
taggggtgga 
ccacagcagc 
cagtgtgatc 
cctctgcatt 
ggaaaaattg 
ggtggatctc 
cccctggagt 
ctcggtgtct 
gtgctcctct 
tttttttggg 
ttgggtgcaa 
ggagccctca 
tcaatattgg 
aactttttcc 
cccaacatac 
tctgtcctga 
ctgtagtctt 
tctggatccc 
gttattaagc 
cctactgtgt 
tttttgtaac 
gaaatgagta 
tgtagtttag 
gtttgtgtgt 
catctggtat 
tctcagttag 
ccagaggtcc 
tagagagaca 
actcaaagtc 
cattttctgg 
atgttaatgc 
aactagtctt 
ttttggttta 
taaactttac 
gtggctgatc 
agcctccgaa 
tttagtagag 
tgacccacct 
gccaaacatt 
cctttatttt 
tctctgtcca 
ttctgcatct 
gaatatctat 
tatttacaca 
taactctgta 
cttgatattt 
agaacatgac 
atgtaagaat 
gccttccagg 
attgcagcag 
gattctcaga 
cgttgctttt 
acagtaaaat 



aggtgagatt 
aaacaaacga 
tcttgactgg 
agcctacaaa 
tcacctggaa 
ggactttcct 
atggttgggt 
agcttggttt 
gattcttgca 
aagtgacagg 
atgcagacag 
gtctagggtt 
tttcagcttt 
accacaaggg 
gatcacatgt 
aatgagatgg 
cgggcagctc 
gcatcgttcc 
gctcctgtca 
cacaggatgg 
aaacaggaga 
ccagggaccc 
attctgtata 
tatccttgac 
tgacctagca 
cacaggggtt 
tttacctctc 
ataaaagcca 
cacatgcaac 
ctttttmaat 
cactgaaact 
acacttaacg 
ggagcttagc 
gcacatacac 
ccttgtattt 
tttagaaagt 
tgaggacatg 
taatctatca 
caggttgggg 
ttgacaattg 
tggagaggaa 
tcaagttaaa 
cacttcttat 
tttttttttt 
tccgctcact 
gtagctgaga 
atggggtttc 
gcctcagcct 
acttttttaa 
tctccacaac 
ctagaatggg 
atagtgccca 
tttgatgccc 
tgggaaaaca 
atccagggct 
cttagatata 
taagtgtaag 
gccatgacta 
tctatacata 
cagctaataa 
ttgtatgcac 
ctcaatagta 
aaatgaggtg 
ttcagtatct 
agaaacaagc 
tttcttttct 
attgatctag 
taaaagcctt 
tctcattctg 
catggtcatg 
aatgctttga 
gttttatttc 
agtgtggctt 
tcaggtgtac 
aagtgtttac 
gctgtcttca 
cagcaggctt 
ggacatggag 
atggggagcc 
atttgccgat 
accatcactg 
acatccagcc 
ggcctgtggt 
gcctgttctc 
cacccttctc 
ctatgtatct 
ctactcctct 
agcccaggtg 
agacctactg 
tccacttatg 
atgagcctct 
aaataagtta 
ctgaaattcc 
gtactctggt 
tgagaaagat 
tggattttct 
acacaaattt 
gctcgttatc 
ttattttgcc 
tgttgaaggt 
atcaatacat 
cagggtggcg 
gtttgtctaa 
taatgaggca 
ttttacaagc 
gatatccttc 
tttttggtag 
gcaacctctc 
ttacaggcac 
tccgtgttgg 
cccaaagtgc 
tgaggtcttc 
agttatttga 
aaaaccaaaa 
atcacacaaa 
aaaatgacat 
cagtgattca 
cctaaccagc 
tcacatacag 
gagaatgtgt 
tctgtgtccc 
caattttatg 
aggaagtttt 
tagtagtgtg 
acagcaagac 
actcattttg 



gcatttatct 
ttaagaatct 
ttttctgaaa 
acaattaggc 
cctttggagc 
tattttcagt 
cagtctattc 
tctcatggtc 
gtattgaaat 
gcttctttgg 
aggtcgtggg 
agctcctgaa 
ggcggcttgc 
tctgtattcc 
gatgagtgca 
agaaagggga 
gatgctctga 
gtctgctggc 
acttgtgtcc 
gggccagagt 
acttaggtcc 
tacacatcac 
gaggtcccag 
gctcgttctt 
ttccttattc 
tggttttcct 
agcccttctt 
cagtcaaaac 
ttctgcaaat 
cacatcctga 
gaaggccact 
ttgaagtgaa 
gcactttgca 
cactcagtga 
cttctgtaga 
aaggttgagg 
ggtcatggta 
gtaagattta 
tgtcgtttcc 
agacctgaga 
tgtccgaccc 
cctggctgag 
ttagaagttt 
aaccttattc 
catcctgggc 
gtgccaccac 
ccaggctggt 
tcgaactaca 
cttgactttc 
atattattta 
aaagcagaaa 
atattgagta 
gagaagagga 
ttcagttgca 
tcctaacatc 
gtactcaagg 
ctggcacaat 
cagaagttcc 
aaacagttta 
taaaatacat 
acctcggttc 
agagttttga 
agatgagaca 



tggattttca 
ttcttacaat 
cagtcttttc 
tcttgaaatt 
atcttctatt 
gtcagttttt 
acaaggcctt 
ctgaaatttt 
aggagagttt 
tgccccactg 
agtgtttttg 
gccccaatgg 
attaatcagc 
aagttcttgc 
aggttttatt 
cggagtggga 
ccacccccag 
atctgctggt 
atgcccacta 
ggtcttggaa 
ttgggcaaag 
ttcccaaccc 
ttctaggttt 
cctctatcat 
aagaggtgtt 
tctttgtaca 
ctctactgca 
tgtttcagat 
gaactcctct 
tatattttct 
gatactcttg 
taagagtaaa 
ttaagtgtgt 
tagtttccaa 
cccaggaaat 
acacacctgt 
cagcttgctt 
ccttggtttg 
aggctgtagg 
tcaatagaaa 
ccatttctct 
gaggaagtcc 
tcttgctaca 
tgtcgccagg 
ttgagcaatt 
acctgactga 
cttgaactcc 
ggcatgagcc 
ctttccccaa 
tttacttgaa 
tttttgaaag 
tcagtaaata 
aagataaata 
aagcttttaa 
aagatatgtc 
gcacaagtca 
cctgtttatc 
ttatcatcat 
gctttagaaa 
aattttcagt 
aattatttaa 
ttataggcat 
agaattgaga 



75240 
75300 
75360 
75420 
75480 
75540 
75600 
75660 
75720 
75780 
75840 
75900 
75960 
76020 
76080 
76140 
76200 
76260 
76320 
76380 
76440 
76500 
76560 
76620 
76680 
76740 
76800 
76860 
76920 
76980 
77040 
77100 
77160 
77220 
77280 
77340 
77400 
77460 
77520 
77580 
77640 
77700 
77760 
77820 
77880 
77940 
78000 
78060 
78120 
78180 
78240 
78300 
78360 
78420 
78480 
78540 
78600 
78660 
78720 
78780 
78840 
78900 
78960 
agcctgtaat 
aaaggaaaat 
ttgtaatacc 
tattaattta 
atatgaaaat 
gccattctga 
ataatcatta 
tatgggctaa 
aattctcctg 
ttgcaagtca 
aatatgtgtt 
attatgtgat 
gagaaatata 
aatggatgag 
gaattaggtt 
tagatagata 
acaggtgtag 
cctaggcaaa 
taccaaaacc 
atgtggagaa 
ctttatttcc 
aatcacagaa 
tgtgctgttt 
aatgagactg 
tgtggatttg 
gaagaccatt 
cactgagttt 
tctgcgctac 
tgtctatggc 
tagatccaat 
ttctgatact 
ctccctcacc 
atcagcagag 
cctgttttat 
ggaatctaaa 
ctacagtctg 
tgtcatgacc 
acatgtccta 
tcagctaaaa 
atattccatg 
agtctgcacc 
gataagcaca 
actgttgtgc 
cttgaatgta 
ccattcttct 
agaccagaaa 
gaagagtggc 
tatggatcca 
ggaggtcact 
ggactaaaat 
atccaccttg 
tttttgtttt 
acaggaagag 
ctctttagac 
aagatagagt 
ttcctccaac 
acaggtgtgt 
ctatattgcc 
ccaaagtgct 
aaactttcta 
aaattctgcc 
ttgtacatta 
tttgttgtta 



aaaaatttca 
gtgaattagc 
tcttgttaga 
ttttctttct 
ttattttatt 
ctgtgcattt 
gagatgaaga 
taaacaagct 
agaacatgac 
ttcattctta 
gagaagtata 
gagaaagtat 
gaatactctg 
aaatgcttca 
ttaaaaagaa 
gatagataga 
gtatagatat 
tgatataatg 
aaagtttatg 
aaactaattc 
atagcttgaa 
ttcattttac 
ctggttgttt 
gactctcgcc 
tgctatacat 
tcctttgctg 
tacatgctgg 
agtgtgaaaa 
ttctcagatg 
gtcatcaacc 
tatgtcaaag 
atcgtcttgg 
ggaaggcaca 
gggactctct 
ataatagctg 
aggaataaag 
atggtgatgc 
gcgttctgat 
agctcatgct 
aatcagcagc 
tcaggtgcac 
ctgaattttg 
tctttttgtt 
ttctattctt 
gcactccaag 
ctaggggcca 
aatggaaaaa 
gctctctgag 
ttctgtttcg 
taaaatggga 
ctttgaatat 
ccttagcact 
aagacattgc 
attatgctaa 
ttcactctgt 
tcctaggctc 
gctaccacac 
caggcaggtc 
gggattatag 
cttattgcct 
attacgccat 
agtaactgta 
gttttacaag 



agtatgaaag 
aaagattctg 
tacaagttgt 
aaacacatta 
tcacagttct 
tgagctatga 
ttccagtacc 
tcccaagaat 
atcacgcaag 
catgctctcc 
attaaaaaat 
atactttata 
gaaataaaga 
caaagaagtg 
caggaagata 
tagatagata 
atctataggt 
acattgtaga 
gtagtaccca 
aaatgaacat 
gagcaaactg 
ttgggctcac 
acctcgtcac 
ttcacacgcc 
caaatgcaac 
gttgctttac 
cagcaatggc 
cgtccaggag 
gactcttcca 
acttctactg 
agcatgccat 
tgtcctatgc 
aggcattctc 
tttgcatgta 
tcttttacac 
atgtgaagca 
ctttgtttcc 
ggtgagtttt 
gggtaaaaat 
atgagctctt 
tgtatttaaa 
aggagcactg 
tacaacggca 
attctcgcct 
aaatccattc 
tggtgatggc 
gaggggaaga 
ttacataaac 
cctgattttg 
aatgaaagaa 
tccttctatt 
ggcctgttta 
tctggcatct 
aaagaattat 
tgcccaggtt 
aagtgaccct 
tcggctaatt 
ttaaactcct 
gcataaatta 
caaaatacta 
tagaaaataa 
tttataaggc 
gaaaaaagag 




aatgggcaaa 
atgtgaagag 
gattaaaaga 
agtcattgct 
ttgttcattt 
tcttggaaca 
ccacataaga 
tgcttctaca 
gctacatgga 
tttttgcatt 
gaacttcact 
atgatacaat 
agagagaatt 
gcctttaaag 
gaagagatgt 
gatagataga 
gtagagaaag 
cttacaaaat 
gaaatagata 
taacatggta 
tcaggaaatg 
agattgcccg 
cctgctaggc 
catgtacttc 
cccgcagatg 
acagtgctac 
ctatgaccgc 
agtttgcatc 
ggccatcctg 
tgctgacccg 
gttcatatct 
cttcattctt 
cacctgtggt 
tataagacca 
ctttgtgagt 
ggccttgaag 
taataaacat 
aatattctct 
gagatttttc 
ccttggaggt 
tatgtgtttt 
ctgtgggtga 
aacaaaataa 
gctgcttcag 
ttctgtactc 
ttctggtttt 
gaatgattgt 
ccaccttccc 
aaaaatttaa 
ccgagtaaac 
tgtaattaag 
tatgtgtcca 
tcaagaaact 
ttaaggaaca 
ggagaacagt 
cccgcctgag 
tttaaatttt 
gggctcaagt 
ctgtgcctgg 
aatattttct 
aaagataaaa 
ttctgcattt 
ccccaagcta 



aattaaaaac 
aggtaaaaag 
ttaaataaca 
ctaaacatcc 
atttggcata 
cttgcattag 
atttctgcaa 
ctaagtttca 
gataaatctg 
ttaagaaaat 
ttgtccctgc 

gggggctaaa 

cagtgacacg 
tgggtcatgt 
agctcagtca 
tagatgatgg 
aatatattag 
tatagtaagt 
actttcgatg 
tgtgcatttg 
tccaacacaa 
gaactccagt 
aacctgggca 
ttcctcacta 
tcgactaata 
attttcattg 
tatgtggcca 
tgcttggcca 
accttccgcc 
ccgctcatta 
gctggcttca 
gctgccatcc 
tcccatatga 
ccaacagata 
ccggtactta 
aatgtcctga 
taaatcgaaa 
gtgagtctat 
taggctttgc 
tgttacacgt 
atccaaactc 
aacgtggcat 
atgtgctccc 
cagagatgtc 
ctttcctgac 
tataagtgct 
acttttctta 
attctgagcc 
gcaggctgat 
aataggtcat 
tttaggtaag 
aggaagtaag 
ttagaaagta 
agtaaaaaat 
ggcatgatca 
cctccctagt 
tttgtagatc 
ggctctcctg 
tcaagattct 
aaatatcttt 
tattttgtgt 
tgttcctgtt 
tctaagtcaa 



ttctcgaaag 
ccaaaaggat 
actctgaaaa 
tcaagatact 
tgtttattga 
aatcctacag 
aaacctgtga 
ataatcactt 
tgagtaaaaa 
ttagtataaa 
tctcaacacc 
taataaatta 
tattaggaga 
agcatgaata 
tatagataga 
atagatagat 
taatgtaatg 
ctaccataat 
atatatgaag 
tttatattgg 
atggcagtgc 
ctctgctttt 
tgataatgtt 
acttagcctt 
tcgtatctga 
cccttctact 
tatatgaccc 
catttcccta 
tgaccttctg 
agctttcttg 
acctctccag 
tccggatcaa 
tggctgtcac 
agactgttga 
atccattgat 
gatgaaatat 
tctttggctc 
gttgagtgtc 
tcctccacat 
acagaatcaa 
ctagatgatt 
gccctggaac 
agcccaattt 
tttaagaaac 
ttgctgtggt 
cttacatagt 
aacttgagtt 
tccagtgtat 
ctgagaataa 
acgatagagt 
taatgttaat 
gatttcttcc 
gaaagaattc 
atattttttt 
tattaataac 
agctgggacc 
tgatatctgt 
cctcagcctc 
tctactctcc 
attggaaggc 
atatattaag 
gttgtttggg 
caacggcaat 



79020 
79080 
79140 
79200 
79260 
79320 
79380 
79440 
79500 
79560 
79620 
79680 
79740 
79800 
79860 
79920 
79980 
80040 
80100 
80160 
80220 
80280 
80340 
80400 
80460 
80520 
80580 
80640 
80700 
80760 
80820 
80880 
80940 
81000 
81060 
81120 
81180 
81240 
81300 
81360 
81420 
81480 
81540 
81600 
81660 
81720 
81780 
81840 
81900 
81960 
82020 
82080 
82140 
82200 
82260 
82320 
82380 
82440 
82500 
82560 
82620 
82680 
82740 
ttatgacaaa 
ggtaatcaac 
ttcatatcca 
agcaaccctg 
ctctaaaatt 
tgagttttat 
agaggtagtg 
tgtttgtttt 
ttggctcact 
gtagctggga 
atgggttttc 
ctcagcctcc 
attttagggg 
tgtggcatta 
gaagtgggat 
aatttcaggg 
atgtaattac 
ctttttctca 
tgacttttgt 
agaaatttta 
tgtgaattgg 
atggagtatt 
tccaaaaagc 
ataattattg 
ggatataatc 
agcatgttcc 
tcttctcttg 
acttaaaatt 
aacaattatg 
tatttaagct 
gaattcaaag 
cttaacataa 
gagatggccc 
aaatacaaat 
tagaagggat 
tgatttcctg 
aaccaaaaga 
aagagatagt 
cagtcgttcc 
aatgaaaatt 
gttgcaaact 
atccctggga 
cagaactaga 
attcaacatc 
aaaataataa 
ttggaagcat 
tttaacatgg 
atccaaatag 
ctagaaaacc 
aatcaatgta 
aatcaggaat 
actaactaga 
agagatgaca 
cttgattgga 
aaggagatta 
ggcacaatct 
aggttagact 
ctcaaacatt 
agcttgcaga 
acaaactccc 
atatatatat 
gctcatagat 
gattcaatgc 



tatatggtta 
caaaatgctg 
tttgttacta 
ctcatttgtc 
tacttcattt 
ttatttttcc 
aaaacaaata 
gagacagagc 
gcaacctctg 
ttacaggcac 
gccatgttgg 
caaagttctg 
gaacatattt 
ttaagattcc 
tgttttatta 
gcataaggaa 
aattaacaac 
tctgatcaag 
aggtagaagt 
cttaatttca 
ggtggtggct 
ttttctttaa 
aaactaacta 
caaaatttat 
aagattcttg 
aggcagtgta 
gataacaggc 
aagattgtga 
atggtgcaca 
ggcatcctag 
cagtaattag 
caactaaata 
aaaaaacaaa 
aaccatcaga 
ggataaattc 
aacagaccaa 
aatcccagta 
accatttgta 
ttaaggccag 
ttaggccaat 
taatccatga 
tgtgaggttg 
ggcaaaaact 
ccttcatgtt 
gagccatctg 
tccccttaaa 
tattggaagt 
gaagagagga 
caaaaacttc 
caaaaatcac 
gcaatccaat 
aagatgaaag 
caaacaaaag 
ttgaaggatg 
acatttgagt 
aatcagctga 
ggcttagcct 
gaactccatg 
tggcctattg 
atatatatat 
gtatatccta 
aaggaaaatt 
tattcctgtc 



tatttcacag 
aactctaagg 
ggatacataa 
ttaattagag 
catctgcaaa 
ttcatcattg 
tcaaggcctt 
cttgctcttg 
cctcccgggt 
ccaccaccat 
tcaggctggt 
ggattatagg 
aatgtatcat 
atttaaaaat 
agtatcaaat 
gattttccca 
catgtatttc 
gcaccatggc 
aagcattttc 
cactatattc 
gaaggatagt 
tgaagaaata 
tagtcagatt 
taattttaaa 
ttttcaatat 
ctgagtgttg 
tggaggggaa 
caagtgtcat 
gattttactg 
agatgattag 
aactaagcac 
atataaggag 
attataaatg 
gactattatg 
ctggacacat 
taatgagctc 
ccagatggat 
ttgaacctat 
catcattctg 
atccttgatg 
gaacatcaaa 
gttcaacata 
atgtgttatt 
aaaaactctc 
tgacaaaccc 
aacctgcaca 
gttagccata 
attcaagcta 
ttcagctgat 
ttgctttcct 
tcacaattgt 
agcactacaa 
ggaaaacatt 
caaagtattg 
cagtgaggtg 
cagtgtgccc 
cccaatctac 
ttcttcagct 
taggaccttg 
atatacacac 
ttagttctgt 
aatattatta 
aaactaccaa 
cttctaaaga 
ccaagagaaa 
taattgagct 
tagaattttt 
ttaacagatt 
cagacttaga 
gttcctcatt 
tcgcccaggt 
tcaagtgatt 
gcctgactaa 
ctcgaactcc 
cgtgagccac 
caggttgtat 
ttgacatatt 
tatcttaagt 
ctcatccaac 
taatttcaca 
tccaacatcc 
tgcattgtgt 
atttctgtgg 
aataatatgt 
tgctgttctg 
atttggattt 
aaatcacgtg 
atttactcaa 
tgtatacagt 
gatatatctt 
aaataagtag 
taggtagaag 
ggaataacta 
cacatcgaaa 
gctgctagct 
accaaggaga 
agcacctcta 
acccttcaca 
caacattgaa 
tctctgccaa 
gcaaaaaatt 
atatcaaaac 
aatattaatg 
agcctaatct 
cataaagtta 
tcaacagata 
agtgagctag 
acagccaaca 
aggcaaggat 
acaattaggc 
tgcctgtttg 
aaacaacttc 
atacaccaat 
tagaagaaaa 
tgagaattac 
ctgtgatggt 
atccttgatg 
ggaaaggcag 
agaatataaa 
atctttctcc 
ttgagacttg 
ccttgtgatc 
acacacacat 
ccctctagag 
caatggccat 
tgacattctt 



caagagtaca 
ccgtatgtct 
taattatgac 
gtttgtttct 
ttattccttt 
ctgatattat 
tcatttgtgt 
tggagtgcaa 
ctccttcttc 
tttttgtatt 
tgacctcatg 
catacccagc 
atcattagtt 
tgttcacgca 
ttcaggctta 
atagacttat 
tgtgcgggat 
tcatctatgg 
ccataatcta 
tcaaaaaaat 
ttaggcaatt 
cttcatccat 
tatgagaatt 
aaccaaaact 
aaaacattta 
gaatataaaa 
aaacaaacaa 
aataaggcaa 
atgtcttctc 
gaagaagttt 
tgatagaaag 
agactaatat 
tgttaccact 
tgcacacaaa 
agactaagcc 
tcagtagtaa 
attctaccag 
taggaagaat 
ctggcaaaaa 
caaaaatctt 
atcacaatca 
aatgtgattc 
cagaaaaggc 
gtattgaagg 
tcacacagaa 
gctgtctctc 
aagagaaaga 
cagtcaatat 
agcaaagtct 
aactgtcagg 
ataaaatact 
aaaatctgct 
taataatgag 
tgtctgtgag 
acataccctt 
gtaggcagaa 
cgtgctggat 
gactggcttc 
atgtgagtta 
atatatacat 
aaccctaata 
actacccaaa 
catagaatca 



gctgaatctc 
cattgtatat 
ccttgcatcc 
tctataaatt 
ccaatattta 
aatgtcaaat 
tttgtttctt 
tggcacaatc 
agcctcctgg 
tttaatagag 
atccacccac 
cttgctcctt 
tgattttttt 
tctgttgtta 
atgacagtgt 
actgagagag 
gctttgttct 
gatatcttgc 
ctttttaatt 
caaataaaat 
tgggtaaatc 
ttattgcaat 
tcataatatt 
aaacacttgt 
tggataactt 
tacagtttct 
aataaactgc 
caggagagag 
ttagggttga 
gggagaaata 
atctcaacaa 
gaagaaaaga 
gaacccctag 
cgacaaaatc 
aggaagaaat 
gtagcctacc 
atgtacaaag 
gattcttccc 
cacaacaaaa 
caacaaaata 
agtaggcttt 
atcacacaga 
ttttattaaa 
aacatacctc 
tggggaaaag 
accactccta 
aataaagggc 
aattttatat 
cagggtacag 
caaacagtca 
taggaataca 
caaaaaaatc 
tgagtgtcaa 
ggtgttgcct 
aatctggatg 
aaacatgaaa 
gcttcctgcc 
cttgcccctc 
ataatactta 
atatatacat 
cacattacat 
gcaatttaca 
gaaaaaaagc 



82800 
82860 
82920 
82980 
83040 
83100 
83160 
83220 
83280 
83340 
83400 
83460 
83520 
83580 
83640 
83700 
83760 
83820 
83880 
83940 
84000 
84060 
84120 
84180 
84240 
84300 
84360 
84420 
84480 
84540 
84600 
84660 
84720 
84780 
84840 
84900 
84960 
85020 
85080 
85140 
85200 
85260 
85320 
85380 
85440 
85500 
85560 
85620 
85680 
85740 
85800 
85860 
85920 
85980 
86040 
86100 
86160 
86220 
86280 
86340 
86400 
86460 
86520 
tatttaaaaa 
aacaaacctg 
aaacagcatg 
cataaataag 
ggggaaagga 
gattgaaacc 
acttaaatat 
tggacatagg 
caaaaattga 
aacagagtaa 
aaaggtaata 
attaaaaagt 
taaataatga 
gatggaagat 
atgtgggtga 
aaaccttcac 
gctctgtgaa 
ggaatgaggg 
tttttttttt 
tgcagtggcg 
gcctcagcct 
gtatttttag 
tcgtgatccg 
ctggcctata 
gcattcatgt 
tgtacaggtt 
aagggaaaaa 
gcaaataaat 
cccaacacct 
tattttaccc 
gtgtccaaca 
tgtcattatt 
aatgcaaaac 
caaattgttg 
caattactac 
gtgtattaaa 
cttgttaact 
aggaataaaa 
ttatcagtag 
ctattaaaat 
ttaattgaaa 
caaatattga 
gtgttgaaca 
gtttactaga 
tcaaaagaga 
aatgaaatat 
ctctatgttc 
agaaaagagg 
gcagaaattt 
tatggtcctg 
tagtgaatgt 
cccttatctg 
gaacaagtat 
ggacagatca 
tttacatatc 
aagtacatga 
tagaaatgat 
ttctaaagct 
gacttaaatg 
gccctagaac 
agacactaga 
gacaaggaag 
actactagac 



tttataggga 
gagcatcacg 
gtactggtgc 
accacacacc 
caccctattc 
ggacctgttt 
aaaaccctga 
aactggcaga 
caaatttttg 
acagacaacc 
tccagcatct 
gagaaaagga 
aaacacatgg 
gggagtaggg 
tgaaataatc 
atatacccct 
ggttctgagt 
agacagtgac 
tttttttttt 
cgatctcgac 
cccgtgtagc 
tagagacggg 
cccgcctcgg 
gtaaaagttt 
atctcttttg 
acatcttata 
aaaggtaggc 
aacattttgc 
ggattaaata 
agtagatttt 
cacaattcag 
tttttacctt 
cccatgtgaa 
caatcaagtt 
attttccctg 
aaatacttta 
cattttttgg 
attctgactt 
gagtcaagtg 
gagtttgtat 
ttattcacaa 
acccacttga 
caccttgaat 
ataatatctg 
tcttattttt 
catagacatt 
tcactcgtac 
atactagagg 
gttaaaggat 
taggattact 
tcccataata 
atcactatac 
tatatgtcat 
attcattaga 
tttctttaaa 
ctagtgatga 
gggcacttta 
cataggaaaa 
agtttttatt 
ttaggttcat 
atggaaaaca 
gtgctcctgg 
ttatctacct 



accttgaacc 
ttacctgact 
aaaaacagac 
tacaaccatc 
aataaatggt 
cttacaccat 
actgtcaaaa 
gatttcataa 
gatataaata 
tacagaatgg 
ataaggaact 
catgaataga 
acacaaaggg 
aaaggatcag 
tgtacaacaa 
gaacctaaaa 
ttgaaaaggg 
agataatgct 
ttttttgagg 
tcactgcaag 
tgggactaca 
gtttcaccgt 
cctcccaaag 
ttatttcggt 
ccttcataat 
ttcatatgtc 
aaaatatttt 
taagttacac 
ccgtgaacta 
acaatttaac 
gctgaaagag 
tattaatcat 
taaactcctg 
gcttaagtcc 
gagttgtgtt 
gtgataagag 
agcactagtg 
ctgcacttgt 
acaataaaag 
acatttctta 
caactgtttc 
ataactattg 
aaccactgga 
ttgaaaagac 
ttatggctga 
ttacattaag 
gtggaaacta 
cttcctagag 
acaaattata 
gtagttcata 
aagaaatgat 
atcatatgca 
ttaaaaataa 
aattgtactt 
ttatcagaga 
gtgtagaata 
gaaactattt 
aatcaccata 
gacatatcat 
gcactgtgct 
aagagaaagg 
ctgagaaact 
ctgttttaat 
caaatagcca 
tcaaactata 
aaataaaaca 
tgatttttga 
gctaggctga 
acacaaaaat 
ccctgaaaga 
tgaagacacc 
aacttaaaag 
aataaaatat 
taaacgaatt 
tacttttcaa 
gagaacaaca 
aaaaaataac 
acccccatga 
aaaaaaaaaa 
cttgacccaa 
gatttaggtg 
cggagtttcg 
ctccgcctcc 
ggcgcgcgcc 
gttagccagg 
tgctgggatt 
atatcattgg 
tagggcaaat 
ctagctaagt 
gatctttatt 
attcaccagg 
actcaactac 
ttctttattc 
aatcacagaa 
tccattcatc 
atccccaaag 
caagtttcta 
ttttagtaac 
ctccaaatta 
tctgtgggag 
actttctctt 
tagcttcaaa 
tctagacggg 
actttacaaa 
gagtaaacat 
gtaaatatta 
cccttataaa 
atagtatttc 
taaaataagc 
aaaaatgttt 
gctaggaagg 
gcaaggtaag 
atatatagtt 
aaatatttga 
tcaaaacatc 
agtaaaataa 
tttctagaaa 
aaatttggag 
gaaaaatcta 
tgtcaaagtc 
tcgaggtaaa 
aggcaagaat 
tagataatgt 
gatggtgctg 
gatattcaca 
tgaagatcgt 



aagcaatcct 
ctataaggct 
atgaaacaga 
caaagctgac 
ctggctagcc 
caactaaaga 
caacataggc 
aaaaacaatt 
cttctgtgaa 
ttgcaaacta 
tacaagaaaa 
aagaagacgt 
gacactgggg 
tgtttggtac 
cgtgagttta 
aaaaaaaagg 
gaccagtatg 
tagactatag 
ctctgtcgcc 
cgggtttacg 
accatgcccg 
atggtctcga 
acaggcgtga 
aaaaatgtgc 
acatgtacta 
gaggattgga 
ctgtttgttt 
ttttccataa 
aacaattttg 
taaccatttt 
acatcaacgc 
tgattggaat 
caatcatctc 
ttgagcacat 
attgcaattg 
caagatgggt 
tacctttctt 
cttgtctcaa 
tgttagttta 
tgatgagaaa 
tatcattctc 
taatatgaca 
ataagacaat 
ggaaacgctt 
attgtatcat 
caggaaaaga 
atgtcagaga 
gtagagagaa 
agggataagt 
tcaaatagct 
cataagggat 
actgtgtacc 
^99g9aaaaa 
acataaaaca 
caaatttaat 
tgaatgcaaa 
tgacattata 
gagactgctt 
aaagccacaa 
gccctgttat 
aagtccattg 
ggagttgctt 
aaaatatcat 



aagcaaaaag 
acaggaagca 
attgaaaggc 
aaaagaacag 
atatacagaa 
tgaattaaag 
aacaccattc 
gcaataaaag 
agaaactttc 
tccatctgac 
aaatgatccc 
aaatgggagc 
cctacttgaa 
taggcttagt 
actgtatcaa 
ttaaaaaacg 
gctataccag 
taaaagtttt 
caggctggag 
ccattctcct 
gctaattttt 
tctcctgacc 
gccaccgcgc 
ttccaatgaa 
tttgatcaca 
tacttggtaa 
tcataggttg 
aatattactt 
aggtccaaat 
gtgaaaggta 
ttatgaatta 
aatgaaccat 
gtagtgatta 
gagtctgtac 
tttcttcaga 
ataacttatt 
ggcatcaggt 
ttttggattt 
gattctcagc 
gtattacact 
accattcatt 
ataaattcaa 
caattcaagt 
tatcaaaagt 
agatggacac 
atattaaata 
agtaaaaagt 
ggaaggatag 
tctactgttc 
agaaaaagga 
atgctaatta 
cactaaagat 
tttgaaagag 
cttatagaac 
attaactaag 
atcagaatgt 
tagattagaa 
tctaaatgaa 
catataacat 
tttatttgtg 
tgactgtgta 
ggctccacca 
tatttgccca 



86580 
86640 
86700 
86760 
86820 
86880 
86940 
87000 
87060 
87120 
87180 
87240 
87300 
87360 
87420 
87480 
87540 
87600 
87660 
87720 
87780 
87840 
87900 
87960 
88020 
88080 
88140 
88200 
88260 
88320 
88380 
88440 
88500 
88560 
88620 
88680 
88740 
88800 
88860 
88920 
88980 
89040 
89100 
89160 
89220 
89280 
89340 
89400 
89460 
89520 
89580 
89640 
89700 
89760 
89820 
89880 
89940 
90000 
90060 
90120 
90180 
90240 
90300 
gcagttcttg 
aactgaaaat 
aatttctcct 
atacagaatg 
tatttgtatt 
ggctaaagtg 
ctccccagta 
aatatttgtt 
aggctcaagc 
cttgcatttt 
gtaactcaga 
aatttttttt 
ttaccatgtt 
caaacagsct 
aaacatatgg 
actttgttat 
tgaagcagtg 
actttaaaaa 
acaccaatgt 
gcttaattat 
caaaactgat 
gaatacaagt 
ttaaataagt 
agttatcaaa 
agtttaacca 
taaaatatta 
ttgaggtcca 
gatgacgtgt 
ctttagctct 
ttaccaatag 
caacatatat 
ctctagaagt 
attataaatt 
gacatttccc 
gttcttggga 
gtgtgatcaa 
gttatatgag 
tctatctata 
atctatctgt 
aaccatgatt 
cacaatacag 
gctctgatat 
tattctctcc 
tttcctataa 
aattgtggag 
acagcattcc 
agtgctcttg 
ctggagcttt 
acctatttta 
ctattttgtt 
aaaactcaga 
acttgctcta 
tttctttgta 
gttttctctt 
atagaaaatt 
attatttgtt 
aaatgtaaag 
ggtgaatatt 
tttacataaa 
tttaagacaa 
ccatcacttg 
acaaattcaa 
gtgttggaga 



cagacttatc 
aagcttcagt 
agtctcccac 
tttagtaact 
ctattttatt 
cagtggtgca 
gctgagacca 
tttgtagaga 
ctcagcctac 
aactggagcc 
tatcagtaac 
ttagttggga 
taccaattta 
actaaatgac 
atatagatat 
ctagtttatc 
aaacatttgg 
actcactgat 
catggaaaaa 
ttttctgtgt 
tcagcactct 
tagaaaataa 
gtgcactttt 
ggcaattcct 
ttggcaaata 
cttcccagca 
aattatttta 
tattaaacat 
gatgcagaaa 
tatttgtcat 
atacacatac 
gaaacttgaa 
aaaaacattc 
aagtacttta 
agagggctgt 
agaacaggtt 
agtattttaa 
tctatctatc 
ctgtctatta 
atttgttaaa 
attgtttgag 
cacattctgt 
ccatttatgg 
tcaatgatag 
attatgtctt 
ctgctgctca 
ggtctttgaa 
actttacaca 
cctttcccca 
gcttaatatt 
cgatagcaag 
catcagttga 
gacctaaagt 
taagctgtag 
aaattatagt 
aagtataata 
attagataaa 
ttccaaaatg 
tctttatttt 
gaaatcattt 
ggtgccatac 
agatgtgctt 
ctttaagttt 



tatgtcttat 
aatctttacc 
acaattcatt 
atgaaagcag 
tcattttatt 
atcatggcat 
caggcatgca 
tggagtctcc 
caaagtgcta 
tgaagaagcc 
atttgtagaa 
ggatgggaaa 
ccataaatga 
aaataatgtt 
cctgtgaatc 
tttagtaatt 
agttttattt 
gtctacactg 
cccctttgca 
gttatttctc 
ttctaatgct 
gagtcatgaa 
tctaagcaca 
acaggcatta 
aataacattt 
cctggattaa 
cccagtagaa 
ttcacaatca 
atttgaatat 
atgcatgctt 
atgaatatat 
tgttttaata 
cagatctcaa 
gcctttaaaa 
ggataacaag 
aacatgtaaa 
atatttggac 
tatctatcta 
gtttgactta 
tatgatatga 
agtagaatgg 
tttattttcc 
agtgcctctt 
ttctctacta 
acttccttga 
gtgtgaaatt 
taaggaaggt 
tgtccatata 
cttaaaccca 
cagtggttct 
attaacatgt 
agggcttttc 
gcctatgtca 
agggtaggtt 
atctttttta 
tgatgattaa 
tatatttatg 
atgcatcctt 
ctaattattg 
caatttccaa 
acacgcacag 
ccttgtagat 
tgtttgacta 




tgctcagttg 
tctgttttct 
ttataggatg 
ttccgtgcta 
ttatgagaca 
actgggccca 
ccaccacacc 
ctatgttgcc 
ggagtaaagg 
aaacttattc 
atttactttg 
atcttggaac 
cacctctagt 
acttgaaaaa 
aaaatttacc 
ccaatacagt 
tggttatatt 
attcctaatt 
gaaggacatc 
acacagagaa 
ttaatgtgaa 
agataaattt 
gggcacactg 
aacatcacct 
tgctaagtta 
ataccatgaa 
gtttaaataa 
ttcagctatt 
atcatgggag 
ttagtaatct 
atagaggtgc 
tgactaagtt 
agagcttttc 
tatgattcat 
agaaaagtga 
aaattcaaaa 
gttgtctcat 
tctgtctatc 
tgccttacaa 
tgaataaaca 
attcacgtcc 
tcagagttta 
cagtgctcag 
gtggcacggc 
attttgttta 

agggatttta 

tgggacatgc 
gaaggatctg 
gtatttccaa 
tctgaacaaa 
ttacaacccc 
taaaagtgaa 
atgggaggag 
gctgcatata 
tccttttttc 
acagatgata 
ccattaaaac 
ctggagagtg 
cattttcctc 
gtaacaggga 
gcacacacac 
ttcaggaggc 
cttaccgctt 



aggtcgttcc 
atggtctttt 
acataatggt 
tcttattttg 
ggatctcact 
agggagcctc 
tggagatggt 
cagttttgtc 
catgagccac 
ccaaatgaac 
cacttttatt 
tccgaatttc 
tttctgttca 
tatcttcatg 
aacatgraaa 
gcgaagcaag 
tcttgcaata 
acccagaaat 
tggtttaatc 
caatgtatgt 
ttacttggca 
tgctactttt 
cgcatggtaa 
ttgatacatt 
cacattcacc 
ctaactctac 
agaactagaa 
tattttcagt 
ttatcaaaat 
cagttgagca 
ttgtatgtta 
cataatctcc 
tttactgaaa 
gcaagcaact 
gaagagaaga 
gaggaaaaaa 
ggattacatc 
tatctctcca 
ccaataaaca 
aatgctagta 
acctgcaatt 
acattttttt 
tatttacccc 
tgaatgaagc 
gtttttgagg 
gcctaaagag 
tgaaagcctt 
tgatgttctg 
actttttttt 
gttttgaaaa 
ttttccttac 
atgaaacaga 
atactattat 
tttaaaggaa 
tttatttttg 
atatacaaga 
ataagattta 
aggaaagggc 
tcaaaggcca 
acgtcagttc 
tcaaaaactg 
atgttcactc 
attaaatgtc 



tgggctcaat 
ctttggcttc 
gacctgatgt 
ctatctcaaa 
gtgtcaccca 
tcacctcaac 
tttttttttt 
ttgaactctt 
attgcctgac 
atttccacaa 
gaatcaatgt 
acccaaatat 
acaatatatt 
taataaarat 
atatgtgttg 
ttaacattat 
agaaacatag 
ttcagactga 
catcattgag 
atagctatta 
atctatattg 
tacacaaata 
acattaaata 
tcagcaaccc 
aggctttcca 
tacaagaatt 
atgatcttca 
agactaaagg 
tattttaaat 
tatgcccccc 
cactgaggcc 
tgcccaggaa 
tattgagaaa 
aaattagaga 
aactaaggat 
atgcttgcaa 
tatttatcta 
tctatctatc 
cttaataagc 
tttaggattt 
gatacgcaga 
ctcctgtttc 
aagcattttt 
tggtttccca 
ttcagggatg 
aaaatttgat 
tgcccacttt 
caatagagca 
ccccataggc 
ttctaaccta 
tcctaatggt 
agcggtgtga 
ccaaatatga 
agagcaggac 
tcattatttt 
tatattgcat 
tatttactaa 
acttttactc 
aagttgtaga 
taattcatca 
cgtgtcaaag 
ttcttggtat 
ttatttgcca 



90360 
90420 
90480 
90540 
90600 
90660 
90720 
90780 
90840 
90900 
90960 
91020 
91080 
91140 
91200 
91260 
91320 
91380 
91440 
91500 
91560 
91620 
91680 
91740 
91800 
91860 
91920 
91980 
92040 
92100 
92160 
92220 
92280 
92340 
92400 
92460 
92520 
92580 
92640 
92700 
92760 
92820 
92880 
92940 
93000 
93060 
93120 
93180 
93240 
93300 
93360 
93420 
93480 
93540 
93600 
93660 
93720 
93780 
93840 
93900 
93960 
94020 
94080 
caatagaaaa 
taagacaatt 
tctaggaaat 
cctgattcct 
taactttaat 
gtgatcagtt 
gtttgtgcac 
acagagtcaa 
cactgaagta 
cctacagact 
aacactatct 
ctcttaagac 
gtgacttctg 
caacccactt 
tcttatctcc 
taaaaaatta 
ttgatattat 
tttgaactct 
ttataatatt 
ccactagcag 
aggattaaat 
cttctcttgc 
ataggccctt 
attgcttttg 
aatttccttc 
ctcaaaggtg 
tccatgtcct 
tcctgcctga 
ccaccctatt 
atctaatgct 
tcttccacaa 
ccatgattgt 
tttaggacac 
gaatcatctg 
gccctatatt 
aagagagaga 
ccttgattta 
gaaattctac 
agcccatcct 
ggatgttggt 
ctagtttgtc 
acttcttctc 
tcattgctgt 
tggccatctg 
tgattgctgg 
accacttgac 
tcatccgact 
gatttaacct 
ccatcctgag 
acctggtggc 
cggacaggtc 
tgttgaaccc 
tgatcagaag 
aatctttcta 
agatccccaa 
atggttttac 
tgcatgtatg 
tttacatgtg 
tttttacatt 
taaaaagtaa 
cttcctagag 
ctacaaacaa 
tgactgtcat 



gtcatccctg 
ttctgcatgt 
tcctagttat 
gaaacaatta 
tttcatgctc 
gtcttcaatt 
aaatgtaacc 
agtagctaac 
tgtagttgaa 
tctgttttgt 
ttggtagatt 
cttacaacac 
agtctacact 
cactcattct 
tttagtgttt 
ttgtatagaa 
ctgaattaat 
aggaataatt 
ataaagaagt 
aacaaatttg 
taaaacataa 
tttactccta 
gttattccaa 
attttgttgg 
aaacccctat 
gagacactaa 
gttagcaact 
gctccgcctc 
gtgaactgca 
tgattatccc 
aactggtcgt 
atagtcaact 
tattccctca 
catatgatac 
cttgagtcct 
gcaaataaga 
ttttcatcca 
tttggtgacg 
ctttgtactg 
gttgatcagg 
ctgcttggat 
agacaagaaa 
ggtgattact 
taaccctttg 
tccatatgtc 
cttctgtggc 
ttcctgctct 
ctccagctcc 
gatgcgttct 
agtgactgtg 
agtggaacag 
catcatctat 
aaacgtgctt 
tttatgagaa 
tgaaaaacta 
atgttacata 
aggaataatt 
taacaattta 
ataaaaaatt 
gagattaaaa 
ttacaacgat 
atgtgtggtt 
ggttgctttt 



catctatgtg 
actgttcata 
ttttcaagcc 
gttattttgt 
cttaccatga 
ccccagcaaa 
atattacaat 
ttgtgtgaac 
gacactggtg 
gccctggatt 
tcccataaaa 
aaggtccctg 
gggaaccact 
tattagctaa 
tacttataaa 
tcccagcacc 
attatctgaa 
ctatctctct 
ttgaactaat 
tcacattctg 
atggaccagt 
caaactcctt 
atttagaaac 
tatagcagta 
gattctggac 
agcaggggtc 
gggccacaca 
ccgtcagatt 
tgcgcatgcc 
caacaacccc 
tggtgctaaa 
gaaaaaggca 
ttccaaaaca 
acccagttaa 
gcctgactgg 
gaaagagaaa 
tttaagtgaa 
gaatttattc 
ttcctgctaa 
atagattcac 
ttgtattact 
gccatttcct 
gaatattata 
ctttacagca 
tatgggtttc 
tccaatatca 
gacactttca 
ctcatcataa 
gctgaaagta 
ttttatggaa 
tccaaagtca 
agtttgagga 
ttgaagtaaa 
ctgtatttaa 
taatctaaca 
tgagtgaaat 
attcttttat 
catgtaaatt 
atacattgtt 
tattttattt 
taacagtttc 
atatgtacac 
tagtggttta 
aaatatgttt 
ctagctaaaa 
ccagttagat 
ttgtattttt 
aaatcaacca 
gcaacttgca 
ggttaaatca 
ccttaattca 
ttgcaaggaa 
aagataaatt 
acaatgaatt 
ctaacataaa 
ctatgtttca 
agggctccca 
ataacacaag 
ataatttccc 
agtataatct 
tgattacttt 
gctgctagaa 
ttatatgctg 
agataaaatc 
cgactgaaaa 
cttatctttg 
agaaaaaaaa 
tctctaatac 
ccctaacccc 
gcaggaggca 
agtcttggca 
agctatctag 
accctccaac 
aaggttgggg 
gttagtggct 
cgttggtaac 
ttggggcctt 
ctgtgtgagg 
aaaggtaaca 
aaatgctggt 
tcttgggatt 
tctacctgat 
gcctccacac 
ccactaatgt 
atgctgcttg 
tgctagctgt 
gcaagatgtc 
ttagtggact 
ttaatcactt 
ttaaggaaac 
tcctcatctc 
ggcacaaagc 
ccctgttctg 
ttgctgtttt 
acaaggatgt 
atcagtgtat 
ctttagtagc 
acaaaacata 
atgtgtgtgt 
tcataacatg 
tacatgtaaa 
gtaataaaac 
tctttcccac 
tttagttttt 
acatatatac 
caattctgta 



aatttctctt 
ctgctcccca 
tattgtcctt 
attattgaac 
gctctttcaa 
tgcatggagt 
tttagcatcc 
atttccggga 
ttgtatccca 
taaaaatcga 
ctgatattta 
gcaaaattct 
cttcagtgga 
tagacattta 
gcaacattgg 
attgtctcat 
gttcattata 
tcattttatt 
acctgaaaca 
taaacccctt 
tatgaaataa 
ttccttcatt 
gtttattcac 
agtgaagaaa 
agcacatagc 
caggccatga 
agccaccagc 
ttagattctc 
gttgtgtgct 
ccccatccat 
actgttgtta 
taatccattt 
acagaaacca 
tgctttttta 
gagaacaaga 
tgtaatggac 
acctaagaaa 
aaaggatctt 
cactgtcggg 
ccccatgtat 
gactcccaag 
tttagtccag 
aatggcctat 
caaagggctc 
gatggaaacc 
ctactgtgct 
atccatgttt 
ctacatcttc 
gttctccacc 
catgtacgtt 
ctacactttt 
gaaacaagct 
ctttattagt 
tttacagtaa 
gacatagata 
gaagggtagg 
catacattta 
ttacatttat 
attaggtctc 
accctgattt 
tttttttaat 
acgattgtta 
gcaattacta 



gaaagttgcc 
ctcttattcc 
tgatgcttac 
taatcatatt 
ggcaagcact 
gttcagtgct 
tgaaagcatc 
3-9a9gttaga 
ctagctcaag 
atgaattgcc 
aaatgtaaag 
attggcctga 
ctcttaaacc 
agttttgatc 
gaatttataa 
ttcattacaa 
atttgtattg 
attctggaca 
aatttgtgtt 
aaatgcattc 
catttggcaa 
agtaattcat 
aaagaaatga 
aaaaacccag 
cagtgaaatt 
actggtatgg 
aagtgagcat 
atgggagcac 
ccttacgaga 
ggaaaaattg 
taaaggaagc 
ttcgacaaac 
ttctttttag 
agaccaccag 
gagaaagaaa 
ttctcattca 
atggttagag 
ccagagcttc 
gggaaccttg 
ttctttcttg 
atgttggtga 
tgctattttt 
gataggtatg 
tgtattcgcc 
atgtggacat 
gacccacccc 
gtggtagcat 
attctcattg 
tgcgggtccc 
agacctccca 
gtaagcccta 
ttttggaaac 
caaataaaaa 
agtcaatttt 
gagaaaattt 
gaggtgtata 
catgtatgaa 
aacaacatat 
tatggaaagg 
tttattctca 
ttcttcgtat 
aaacacaagt 
ccaagaaact 



94140 
94200 
94260 
94320 
94380 
94440 
94500 
94560 
94620 
94680 
94740 
94800 
94860 
94920 
94980 
95040 
95100 
95160 
95220 
95280 
95340 
95400 
95460 
95520 
95580 
95640 
95700 
95760 
95820 
95880 
95940 
96000 
96060 
96120 
96180 
96240 
96300 
96360 
96420 
96480 
96540 
96600 
96660 
96720 
96780 
96840 
96900 
96960 
97020 
97080 
97140 
97200 
97260 
97320 
97380 
97440 
97500 
97560 
97620 
97680 
97740 
97800 
97860 
tcagtctcat 
tgtttctcta 
atgtctctcc 
ggacatgagg 
tggctcctat 
attcaattct 
taatctattt 
acagtggtac 
cttgtttgta 
tgacaaatct 
tatctacaat 
catagtttta 
tgtaagacta 
gaagctagcc 
gccctcccat 
aaccaatttg 
ctataacaag 
tctttgtacc 
attttatatt 
taccttgatt 
aaagctattt 
atacctagat 
ttaattatat 
gcaccgaaga 
taaaagttaa 
agaaaagtgt 
gccaacttac 
tagcaataca 
acaccaagtg 
actatctaag 
tttggtatta 
tttcttattg 
agtctccaat 
ggacaaattg 
ttttgaaaag 
actctgatat 
agtagtgatg 
aataagtgaa 
aatacttttt 
atcaacacta 
agttgttaac 
aaagtttatt 
aaaatcatac 
atttgacagc 
gtatagataa 
catttattag 
gttatacaac 
tcataaggat 
tactggatac 
tactttacct 
tttattactc 
ggtcagaaac 
atgtgatccc 
atagatagat 
tgtaaggtaa 
tttattcatg 
cagaataaat 
gttgcatctt 
cttggcccta 
acaaattctg 
gcataacatg 
tcacatccac 
cgtttatatg 



ggtttagagg 
ggagagaaaa 
acttgtgaga 
ccagcccagc 
aagggtatat 
acacttctgg 
ttcaggatag 
tttgtcacag 
agtattttag 
ggtgactttg 
aagtcacact 
caggggtcat 
tggaggtgct 
ctgtgattgt 
agctcctagt 
tgatcaaata 
tcacaactca 
ttcgagttaa 
attgcatata 
ttgaaagtca 
ttactaattt 
gtattcctgt 
aaaattttct 
aaggttcaat 
tgaatgtcca 
ctttgaacag 
ttttcagaaa 
aaaccaatag 
ctttcatgaa 
tccatatcat 
atatgttatt 
atgacattaa 
ttcatgtctc 
ggtaatacaa 
aacaataaat 
tcctttcagg 
ttatgagcct 
aaatatttag 
attatagttc 
tgcattcagt 
ctaaaagcac 
tgtttgtttg 
aatgaaaaat 
tgcatgcaca 
atttaacttg 
ctaaatgagt 
ctctctcggc 
gctgtcttca 
atagtattta 
atcttagact 
tttggatata 
aatgatgttt 
taaaaaatga 
agatagaaaa 
tgggattttc 
tataatcagt 
caaaccagag 
gtcttctttc 
aatctgcaca 
gttaactttt 
gtcacattcc 
atggtcatct 
gctgtccttc 



aatgtcacaa 
cttacgataa 
gtcactttca 
agttattttt 
gactggctgc 
ggactagagt 
atcatatatt 
ttagttatta 
tgctaaagga 
cagttgataa 
tctcctaaca 
gaaactgtaa 
aacgaaagag 
ccaagactag 
atgacgtaat 
tgtagatgtt 
aaaagacatg 
tctctgctta 
attattttaa 
gattgcccac 
tttacattta 
ttaaatacac 
gatctcaggg 
gactgcagca 
ctttcaaagt 
accagcagat 
tacaagtaac 
ttacgcccat 
ttgactggct 
attgtattag 
ttagtaggcc 
aatcccatta 
ttttctttca 
tgaagaaata 
atgatatatt 
aaaatttata 
aatgtcttta 
acactcacac 
aggtaatatt 
actataatct 
ataggtagat 
tgttactatt 
ataattattt 
ctgaaggata 
ttccataaga 
agttatgagc 
ctcagtgccc 
cagaagatat 
ctcaataaaa 
taggttcctt 
attaatgtct 
tgagagcaat 
tagatagatg 
aaagaagtaa 
ctgcaaaaca 
ctgcttttgt 
agaagaaatt 
tcttggttct 
tcacatcctc 
tatttggcaa 
cgttaaccca 
gccttccatg 
tctatgtttg 



ctgctggtga 
cacctctgtg 
ataagacacc 
caaatattaa 
aaataactgt 
ccttcatgct 
tttattagta 
tatacctaat 
agcaaaggac 
cataagcaaa 
ccaaggactt 
acagaaagat 
aggaaagaaa 
aacttgttta 
gtcaggtggg 
agtttcaatg 
atattttttc 
gtatttctac 
accttgtact 
aaattctgtt 
attaacaatt 
atatgttttt 
cccatgcttc 
tgaaagagta 
aaacaaagtg 
gactctgagg 
cataatgtca 
atacacaaaa 
cgctggaatc 
acataaaata 
ctggatttta 
ttaaagtctg 
aagttaattc 
atcagaaaaa 
attttctcct 
catgtgtata 
ctcatataac 
tcacacatac 
ttaatgtaag 
ccactttaaa 
tagaagggga 
atacttttgc 
tctatcttcc 
taatgcacta 
caaatttttg 
atgaactctg 
tcttctataa 
taattaacac 
ggtacctgtt 
tcataacctt 
gtgttctcta 
tattttctat 
attgatgggt 
aaaaaataat 
aaacactgat 
ctccttcaac 
tatcttcctg 
attaatttgt 
acactgtctc 
gttcaattcc 
gtgttttgtc 
tagtcacaat 
catttcatgt 



gacttgctgt 
tccatgcctg 
tttctccttc 
ctctggaagc 
gacttctgca 
ttctctcacc 
gcatcattca 
tttagaatga 
aagttgatat 
attgatgaat 
ttcaaggagc 
caatacaatt 
aaattatctc 
tgctttatgc 
gtctgagcaa 
aaaaagaaaa 
tcagttgtat 
agaaattttc 
actacatgtt 
aaaacacaca 
gttgaattcc 
agttatgtga 
tatctgtaca 
cacatcatac 
cctcaatttt 
gaatatgtta 
ttgatgttta 
gttcaagata 
acgtctatca 
aatatgtgct 
gctaaaaagt 
tcataagaga 
tgctgaattc 
ataaagaata 
cacctatcca 
aaatactaaa 
agatgttttt 
acacaaaaat 
ttgttttact 
gaaaagacaa 
gctcaaatat 
cactatactt 
agaagtattg 
taaaaaattc 
cagtcagatc 
gaatctgatt 
agtgaagaca 
atgaaacaca 
ttaattatta 
ctatgttttc 
taatattgta 
ttttcagcct 
agatgacaga 
ggagattata 
ggtaacttaa 
agaattcatc 
aggttggcca 
atttttcttt 
cttgccaatc 
atagccaatt 
cctgcagcca 
attgggtcca 
aaacttattt 



ccctctaggc 
acacccactt 
tgccacgaca 
agacaaagaa 
gaacttatca 
tctgcttttt 
aattttactc 
catatgtcga 
tttaaggatg 
gtgtgctgag 
tcgacaatgg 
tgttaaatgc 
tgggcaaatt 
accttcctca 
aataattaag 
aaccataaaa 
ttcaaagctt 
agactcattc 
cctatttaaa 
tataaataaa 
gcttagacac 
aaccagcttc 
gcctagcttg 
acaattttat 
agaatttata 
ttagtcataa 
tgactgactt 
gaaaggaatt 
gcctgttaag 
gatgaaaatt 
aaagaataac 
tttcaagaag 
tttatttgtt 
tttactgtac 
aaagatcatc 
tgttataggg 
gggaacaatg 
aagacagagt 
taatctgccc 
cagatttaca 
gcctggcttt 
cattgaaggc 
ctgatcatgt 
tttttataat 
ctaactccat 
tactgagcaa 
tagtgatttt 
cttaggacag 
actggaaagt 
ttttaataca 
agtgacatga 
ggtatgtagt 
tagatgatag 
atattagttt 
atgtgtatat 
tggaaatgtg 
gttgcccggc 
ttctggaaaa 
tctcaatcct 
tatatgaaca 
tctgtggcaa 
aagtgctcca 
aacatggtgt 



97920 
97980 
98040 
98100 
98160 
98220 
98280 
98340 
98400 
98460 
98520 
98580 
98640 
98700 
98760 
98820 
98880 
98940 
99000 
99060 
99120 
99180 
99240 
99300 
99360 
99420 
99480 
99540 
99600 
99660 
99720 
99780 
99840 
99900 
99960 
100020 
100080 
100140 
100200 
100260 
100320 
100380 
100440 
100500 
100560 
100620 
100680 
100740 
100800 
100860 
100920 
100980 
101040 
101100 
101160 
101220 
101280 
101340 
101400 
101460 
101520 
101580 
101640 
tatccccaga 
ccaaataaaa 
atgagagttc 
cagaaaaact 
cttgaaggaa 
caaccaaata 
ttttatgtgt 
tgggtgtgtc 
taaaggagag 
agaatagaaa 
ttcgcctgct 
gacttaaact 
tttcctggtt 
atgagccagt 
tatatatata 
ctctggagaa 
ttttaaggat 
ttagatttaa 
cataaactgt 
ataagatata 
gattataacg 
gatgagcttg 
ggagatcctt 
cgttgggggt 
taagtgctat 
tccccgttcc 
ccagaataaa 
ttttggggcc 
caacagcccc 
gatgagtttt 
agcatatggt 
agctcattaa 
aagtttggtt 
atgtttaact 
attcaagagt 
aagcttttca 
ggaaaatcaa 
tttcaatggc 
aaatagaata 
tggtgttccc 
tgcaaaaaca 
tacagtccct 
gaaggggaga 
ggacctttta 
cttgacactg 
cagtaggggc 
gtccctgaac 
tactggcaga 
gaaataattt 
aagctttttg 
taggaatttc 
gatcaagaag 
taataaagtt 
atttagactt 
gttggaaagt 
aggtcacaaa 
tttaaaaaag 
ttcttgccct 
agcaccatgc 
ctataaatta 
ttggtactca 
tgagtaatga 
catgagtgga 



gctgacgaca 
ccaaatttca 
taactttata 
tctctgtttt 
tcaataacac 
aatagaaata 
caacttgact 
tgtgaatgtg 
tgttctcacc 
ggcaaaagaa 
cttggacaat 
atttgttccc 
ttccagcctg 
ttccattgtg 
tatatatata 
ccctgatgaa 
atattttctt 
agatgctaat 
ttatacatat 
aggaactatg 
atattgattg 
gggattttaa 
gtctcctgta 
ttgtgggaag 
ctgacagtga 
cagtggtatc 
tggtaaagac 
cacctttacc 
tgaaagaaga 
ctaatttata 
aaaagagtag 
acagagattc 
gtttggctga 
tcccttggtt 
ggacttgcca 
ccaatacatt 
aatgcagtaa 
ccaaggcaag 
gtctgaccca 
agttgtataa 
ttttaggtca 
caatcaattt 
caaggtcttc 
tctgggtaac 
gctctgaact 
ttatgatgat 
tcagcctgtg 
atccccatgt 
atttaaagtg 
ggtaattctc 
aaagggtaga 
aatagcttgg 
gaatagtaaa 
aattggatct 
taatccccac 
agttacaacc 
caattataaa 
ctgccttcct 
attttgactt 
cttagtctgt 
gaagtgtggc 
gtagaggcca 
ggtttaaggg 



tggaatagca 
ccacacgcca 
ttttccaaac 
tgaatttcac 
aaaacagtag 
gaaagaacgt 
gggcgacagg 
tgtctggaag 
aatgtgggct 
gagcagattt 
gttggtgctc 
ctaccaggtc 
caggcggcaa 
tgtgtgcgtg 
tatgtaaaat 
tacagatttt 
atctgtttct 
gactatttag 
atgcaaagta 
tgactctcta 
gttactctta 
tttccagtcc 
gcttcaatta 
atcctgatga 
aagaggtttc 
agcttttcca 
ttcccctgag 
acccttcttt 
ggtaccaagc 
caaacagaaa 
aatgaagata 
tccatttaat 
aacatgaatc 
taatgtagag 
ttgaagacct 
gataaataaa 
tagtaactag 
gtgggcatag 
cacgtatcta 
gtaaatagaa 
agtgaacaaa 
gtagacttga 
tcaaggaaga 
tgcattggag 
gacaatttca 
tagattatca 
gttatttctc 
tggtcctcct 
ttttgtgagc 
ccagagaagg 
ggatgaattg 
ttgtgcttat 
ctagatcaga 
acatgatgtg 
cgtaacagtg 
ttgtgggatt 
tgggcttgag 
ccatgggctg 
ctcggttacc 
gttattctgt 
tgttctatat 
aaagaatttg 
tgattatggt 
tcatatcctg 
tgttgctact 
aaatttggaa 
cttatgaata 
ctgccctata 
aacgtttgaa 
gtgcacaaat 
agattttacc 
ggcatcatgc 
gttttctctg 
ctggacctcg 
ttcaaactta 
atcacaggac 
tgtgtgtgtg 
cttcattcta 
ggtattgaga 
gaggtttctg 
agtagtcgaa 
tccacattgg 
tatacttttg 
atgtcactgg 
aaccctgaga 
gagaattgca 
agctggaaag 
cctactccta 
cctatgtctt 
gcagtttcca 
gcttctaaac 
aagaccatga 
tccagggaac 
tagttggatc 
gttgcaactt 
aaaagatggt 
gaaggaattc 
actcacccat 
cttgtgaggg 
atcatggggt 
ttaccataat 
tgacattggc 
agcccactaa 
agtctattct 
ggcagtttac 
tttccatata 
aaaaggagat 
ggctcacaac 
gtgaattcca 
tggttctaaa 
acttgtggag 
acaaagaaag 
taaccactga 
caaagacttt 
gacacaaata 
taaggcaaga 
gtgatttgaa 
ttaaaaagta 
gataaaaagg 
gctgcaagtt 
acacagcatg 
agaactataa 
tgtagcagca 
cagctgaaca 
caggaacagt 
gagggccagg 



ccaccttgtg 
ccataatgtg 
tctttctact 
tgtctagata 
tgtttgtgac 
aatagtagca 
acttggtaaa 
atttgaattg 
aatctgctga 
ggtgagttgg 
gacctgcaga 
gactgaatta 
tttttggcct 
taatcatgta 
tatatagaat 
gtggttctag 
gaattggctc 
aaagcattga 
attatcctca 
tactttctta 
acaaagtgct 
gcttctatgt 
gaaagggatc 
actgagcccc 
tactcctagt 
gggtaattaa 
tacaagacaa 
ttataactaa 
ggtggtgcac 
atgtgtagga 
aggtctaatt 
ggggagttag 
ccactgtgag 
aaaggtctag 
actggaaggg 
gaaactgcaa 
ggcagaggcc 
gaacagcagg 
tagcacaggc 
attcttactt 
gaattataat 
agacccagaa 
ttgctaggtc 
aatcagactt 
atcattatag 
tctcaaagtg 
atgcataaat 
cttgggtaaa 
aaaaaaatta 
gatgggtttt 
tcctaaagag 
atagatagga 
atgaaatgct 
cgtatcctca 
gggcctaatg 
ctaaataaag 
ctctaacttg 
aaggccttta 
aacaggtaca 
taaaacagac 
tgtgaaagtg 
ctagaaaaag 
aaatgtaggg 



gctactccga 
atggaagtgt 
ctttgtaagc 
atgttacctc 
tctttttgaa 
tgatggttaa 
atgttatttc 
gaagactgag 
gggcctgaat 
gacatcaatc 
ctctgaccag 
aaccaccagc 
ttatagttgc 
atgttatata 
gcttctcttt 
agcaacataa 
tttaatttca 
tagtccatga 
tcagacactt 
gaaaactaag 
gaaagataag 
atgtctggaa 
ctatatgttg 
taaattctga 
ggaattggcc 
ctccatattg 
tacagattct 
acccaagtac 
taaaaaagta 
atggatacta 
tgtcaatatg 
aaaaagctct 
taaactggaa 
agaaattaaa 
ttcagaagac 
ccacacaatt 
aggtggtggc 
ttaaagcatc 
tagttaatta 
gatctgaata 
aactgatagt 
ctccttaaat 
ccacagagag 
ttgagtacta 
ccccttcagt 
ggcccagtgg 
agaatagaca 
agaaatagtt 
tcttgacaaa 
gaaagttgag 
tgaaccaact 
aagtagtaaa 
tttttaacaa 
aaaataatgt 
ggatgtgatt 
gattactttt 
ctgtcccctt 
gatagatgtc 
tttctattta 
taagagaaat 
gccttagaac 
cctagaatgc 
aaagtctgga 



101700 
101760 
101820 
101880 
101940 
102000 
102060 
102120 
102180 
102240 
102300 
102360 
102420 
102480 
102540 
102600 
102660 
102720 
102780 
102840 
102900 
102960 
103020 
103080 
103140 
103200 
103260 
103320 
103380 
103440 
103500 
103560 
103620 
103680 
103740 
103800 
103860 
103920 
103980 
104040 
104100 
104160 
104220 
104280 
104340 
104400 
104460 
104520 
104580 
104640 
104700 
104760 
104820 
104880 
104940 
105000 
105060 
105120 
105180 
105240 
105300 
105360 
105420 
acttcttaaa 
aggccattct 
aggccatctt 
tctgtggaat 
gaagcaaaaa 
agaaagaaaa 
actcttagcc 
tgttcatgta 
gaaaaattgt 
gtatgtttag 
catctccttg 
ccttagtcat 
gtccttcccg 
atagccagtc 
gcagtaggga 
tcgccatcgc 
aacttttgcc 
ttgctacaga 
ggatgaccct 
atccagagag 
gtatgtagta 
aactaatgga 
aaacagggtt 
cgtaataaat 
ttgtaatcaa 
catgttagtg 
tcactcaaca 
caatatataa 
cataaaaagt 
cccaaatttt 
attatcctaa 
tctacaaaca 
gacttttttt 
atgtttttgc 
cagttctttc 
aagcaataag 
gccattttaa 
gaaatgactg 
aaacacaagc 
tttataggta 
ttatatttag 
taacttcata 
atttgttcta 
tgtgcctcac 
tccttcaaca 
acaattgtag 
caaaactttt 
tagcacaaaa 
ggaaggtggg 
tgttctccag 
gaaaacatat 
tacacatgtg 
ctttatatac 
atagttccta 
gcaaagttgg 
aataaaaata 
gtgagaaata 
agtgctgtaa 
gaaatataat 
atcttctact 
gtaaacatta 
aggtctcaag 
accatagtcc 



gactacatgt 
gaaaagatct 
ttttatacag 
acagaactta 
attaaaggtg 
aaaagataga 
tggccatgta 
ggatacagta 
taagtacaaa 
cttccctgtt 
cccctagttt 
cctcagtcac 
ccgaaactac 
agaattagct 
ctagctgctt 
tccatccttg 
tgagtgctag 
gatgaacata 
gaaagcattt 
atacataaat 
caatcatata 
ctagcttgtg 
tctcagtaga 
ttaccagata 
tgtttaatca 
gctagagaga 
gctttttgaa 
tgggtctaat 
tctgtacatt 
aacatctgta 
tcatttgcac 
aagcaacaca 
tttttttcaa 
tgcattagtg 
tgccaagtaa 
taaagtcatt 
aatccatgtt 
cttctttgtg 
aaacctttcc 
gggctgtcta 
aagccaaata 
ggagtatggg 
actactgaac 
atgttgtcag 
acattttgaa 
agctaacatg 
gagttggttt 
taattttaat 
atgtaaggtt 
aaaaaaagat 
gcatgcacac 
tgtattatat 
atagtcaggg 
ttacatgaac 
agctagagta 
tagagaaaaa 
ggaggggagt 
aataaccggt 
agatttaatg 
tgccttccac 
cagatttcta 
ttgctgacat 
ttgaatataa 



gtggctatca 
caggtgaaac 
tggcaaataa 
agagtggtga 
gtgcacagct 
atttgtaatg 
aataataaaa 
aattcctctt 
gagttctgag 
ctttgttctc 
cagtaaacaa 
ctgctctgtc 
tcaccctgcc 
tagactgtgt 
taggataagc 
agatgcaccc 
tttcactttg 
aatagaagga 
caaagatctt 
atatgtttag 
ttttaacgat 
aaaacaggat 
actaaactgc 
ttgcttccgt 
tgaaattaga 
ggagattcat 
tttgagaaaa 
aatatttatt 
gaatttataa 
tgtggtacat 
tctagatttc 
aaataaacaa 
atgaatgtat 
gccaagtgaa 
tggagggagt 
accaagagat 
caacatcact 
atttcttttt 
ttatttatta 
tttgtgtgtg 
cttattcagt 
tggttatata 
aaatgggact 
atctttacat 
gtgagaaaac 
cagcatagtt 
tgggagatga 
gttttggaaa 
aggacatgaa 
gaagaaaaga 
tgaaataatt 
atgtctatgt 
gaagagagag 
agccttttag 
gaatttgaga 
tacagttttt 
agaattaaga 
gacaaatcct 
ttttaatttt 
agagttgcaa 
aatggcgtgg 
caggattatc 
tctctcacag 
tcaaagtgct 
tgagaaacaa 
attggcagaa 
gctagactat 
tattttgggt 
aaaaggaagc 
aagcatgttt 
caaagtttag 
ttccacttca 
cattttaaag 
ccccctccta 
cttagttatc 
actctggctc 
ggtccaaccc 
acctcttccc 
ttctatagaa 
tagcaccgaa 
atccaggttg 
gtggaaaagc 
tttttaactg 
gtggaaactg 
ttctcgtctc 
ctttccttgg 
gtacctggaa 
tgaaataaca 
agtcaaagct 
ttgaattttc 
tcatattttt 
aaacttttta 
catggcctag 
ttcaattata 
agaaaaagaa 
gctcaccgta 
aagagtgagt 
gaggtagaac 
gtgaaaaaga 
gagctgtctt 
aatgcaacag 
tctctgaagg 
tgtgttagga 
agaaatgatg 
ggtatcaatg 
aagaaagtaa 
agattataac 
taaagctcaa 
gtaatttttt 
ataccagtac 
gtacaaatgt 
tatatagaag 
ttgcaaagca 
aaaaactaca 
gtatacatac 
agaggttaaa 
ttcgtcaaag 
ttaggagacc 
gaggtcattt 
gaaaccaggg 
ctatgattag 
aatttttatt 
tattatctgt 
aatagatagg 
taagcattat 
attaatttaa 



ggtagaaata 
tgtactggaa 
ttgtgtccat 
ctggaagaag 
gcttacagta 
agcatgaagc 
gggacagaat 
cctgttaact 
aagaaccaat 
tttaacctcc 
gcctctatca 
tagtcatcta 
atacccctgc 
cagccaatag 
ctcccttgta 
gtaaattgcc 
catttacttc 
gatttatcaa 
tgggattttt 
gaagtgagct 
aaatagggtg 
aaaaaaaaaa 
ttttatttct 
agcatagaca 
gcaagtgaag 
gctttcttct 
tttctaaggc 
aatatttaca 
ttttaagtat 
gaaaataaag 
ttcatttact 
ggggagcaaa 
atgataccgt 
aaggtgttcc 
gatgcccagt 
ggacacagaa 
cagggaacat 
ggaaaaggag 
cccattggtt 
actagacaaa 
catgattaga 
cacacaatct 
aatttcttgc 
ttttaatcct 
aaaagttatg 
aagtcaaata 
taaggatgtt 
acatggcaga 
agtgcaagat 
tctaactacc 
tgtgtatgca 
aatatacata 
tcaccatcct 
caccttaatg 
ccaaaaggtg 
tagaagagaa 
aggtatcatg 
gcaacaggta 
ctcttttgca 
gtctgtgtgt 
agttttaaga 
ccttacattt 
ttctaactaa 



ggggtagcaa 
actagagtaa 
gtagcagggc 
aaatatctgc 
aaatgagaga 
gatttagaaa 
actaaggaag 
tcaagaagga 
caatatgtca 
tcgttcttta 
cctgttctgt 
cactgtaacc 
tctctttaaa 
gggaaagaca 
cggtgtgctc 
ttggtgagaa 
caacaatcat 
gacaatggga 
agggcaaaga 
gggatgtact 
actagcctaa 
aaaaaaaaga 
agttaaatgt 
ttttgattca 
catgttagtg 
ctcagtgagc 
tcagttcctt 
taaaattata 
acttaattac 
aataataaaa 
cttggacttt 
aataccctat 
gtggtagaac 
tttttgagtg 
gatacacgat 
gagatgttaa 
ttaattggca 
tcaactttat 
ttcctgcttc 
gtgttgctat 
aatagttaaa 
gagaatttgt 
aatggcttta 
caaaagatat 
taagctgtct 
aggcaagttc 
ggaaacattt 
ctatttctaa 
gatgttccag 
caatcctcat 
tgcatgcatg 
gtcagggaga 
tctaaagata 
tgttcactag 
aagaacttag 
ggaaggagag 
aattgggtta 
gaaaatttga 
acaaatatct 
ttgcatactt 
tgattttgaa 
tataagtcat 
taacaatcac 



105480 
105540 
105600 
105660 
105720 
105780 
105840 
105900 
105960 
106020 
106080 
106140 
106200 
106260 
106320 
106380 
106440 
106500 
106560 
106620 
106680 
106740 
106800 
106860 
106920 
106980 
107040 
107100 
107160 
107220 
107280 
107340 
107400 
107460 
107520 
107580 
107640 
107700 
107760 
107820 
107880 
107940 
108000 
108060 
108120 
108180 
108240 
108300 
108360 
108420 
108480 
108540 
108600 
108660 
108720 
108780 
108840 
108900 
108960 
109020 
109080 
109140 
109200 
ttcatatgtt 
ttaagagtat 
acagttaaac 
tatcagctct 
ataaagagtt 
ccagtgagag 
aagtagctta 
aaatctaatt 
ggctttgttt 
agttctggct 
gcaggctcag 
tctaatcact 
gtcatttcca 
tgaaattaac 
tggtatcctc 
tagcatattc 
ctatccttat 
aatcagaaaa 
agctaagtta 
ctagtggcaa 
agacacagta 
cctgtttcaa 
tgcagaggag 
aatatctact 
cattggtgac 
tcttcacgct 
ccctcatcca 
tcctttgtgg 
tcagagaaga 
ttggtccacg 
tgcaaccctc 
gtgctttatg 
gccttctgtg 
ctggcttgtt 
ttcacttatc 
aggatctgct 
gccgttacta 
tccatggagc 
cccatgatct 
aaaagaaaat 
ttaccctatg 
tcctagaagt 
tcaattaaca 
gctctatagt 
aaatccttaa 
ctgcttctta 
caaaaagaaa 
cacagcttac 
cctctgccat 
ttttttttgt 
ccagtaatct 
cccagcctat 
tgtgtcttac 
catcccttgt 
ccagccaaaa 
catacctggt 
ttgaatgtta 
gcatgtctaa 
cacagatttg 
ggcctatagt 
agagactaat 
ttcggattat 
aggatgaaac 



ttcatcttat 
ttatattagt 
aaattagaag 
gccaaggttt 
actctcatct 
gcagaaaaga 
attttctttt 
ttagaatttt 
tttaaaattt 
ctgaggagca 
gctctgtcaa 
tcgctatgca 
tgctctaatt 
tagaagcatg 
ctcaaatatt 
tgtttgtcac 
aacaggaacc 
agaatgtcag 
tctaaacaat 
aatctgagca 
gaaatgaata 
ttaaatcact 
caatgcaaaa 
aatccaacat 
tgagttcatt 
gtttctcacc 
ggccaacgcc 
atctgtgctt 
aaagcatttc 
ttgagctcta 
tgctttatgg 
tgtatggagc 
gccccagtga 
ctgacaccta 
ctctccttat 
ctacagaagg 
ttttctattc 
aggggaaaat 
acagtctgag 
tgttttctaa 
atttttcata 
gttggggaat 
ggaattttga 
gaagaatcaa 
aaccagacct 
aacatatatt 
acagcatctc 
tgcaatcttg 
agtagttggg 
agagacggga 
acctgcctgg 
aataagcttt 
aaagaaagtc 
acactcagct 
acacaccaac 
ttatttcttg 
cctaaaagtc 
aaccacatca 
catgggatca 
tcccctagag 
gatgaggtgt 
tgaagggcta 
tagtccgttt 



atagcagtca 
tagataacgg 
tttatttctc 
taggagctcc 
atatgattca 
agaagggaag 
attacattct 
aaaaaaagga 
atttaactta 
aatacttaac 
atctgtctgt 
tgctcagtat 
caggtctgtc 
ttagtagaca 
tatcttttat 
ataaagaatg 
taagggattt 
gtaccaggtt 
taatttacaa 
ataaagatgt 
tgagtaagaa 
atctctagtg 
caatgtttca 
tctcatttcc 
ctcctgggac 
atttacatgg 
ccggctccac 
ctcttccaat 
ctatcctgcc 
catcctggct 
cagcagaatg 
actcactggc 
aattaatcac 
caacaaggag 
catcctcatt 
caggcacaaa 
agctcttttc 
ggtagctgta 
gaacaaagat 
ataaacatta 
gagcatagtc 
ttttatgaag 
attcaatata 
tgtaaatttt 
gggttttctt 
aagctttttt 
actccatcac 
atctctggtg 
actacaggtg 
tttcaccatg 
gcttcccaaa 
aagaaaaata 
aacattccac 
gtcatctcag 
actttggtcc 
gctttttaga 
tgtctaagga 
acccaaaagg 
ctgtcttcca 
gtaacctaga 
gtgcaaggtg 
caacctctag 
agacttggtt 
agtttacaat 
attaaattcc 
atttatgtaa 
aactaattta 
ggatggctca 
ggcaaacctg 
cttggccaga 
aattctggtc 
tttattttta 
ctggttaagg 
tgtctggatg 
gcacaaatat 
gctccaaaaa 
ttttgaaaat 
tttcccccaa 
agctaaaatg 
tatctgacat 
ctcaacttac 
aagtgctgag 
gagataaaaa 
attaagtgac 
tctcctgaca 
agaaaatata 
cactgaagga 
tggccaatca 
tcacggtggc 
acgcccatgt 
gtgactccaa 
cgtcttgtgc 
gtgatggcct 
tccaagagcg 
ctgatggaga 
ttctactgtg 
gtgtcaatgt 
tcctatctct 
gctttttcta 
ttcatgtatc 
ttttatacca 
gtgaaagagg 
ctactgattt 
tgatacaaat 
tgtgaataca 
gttttgcctg 
aaaaaatcta 
tttgcaagga 
taaaaaaaac 
ccaggatgga 
ctcaaatcat 
tacaccatga 
ctgcccagac 
atttaggaat 
gcagttctgc 
tttaccatat 
ctgttgcctc 
acagtgaggc 
ggaaaaataa 
gatccaaggc 
tgtaacatgt 
gtcttgaaga 
acttaagata 
agaaattaga 
agtaggacag 
tcatggagga 



aatcttatta 
tatgagaaca 
aattctgagc 
ctcttctttc 
gcactatatc 
ttttatcaat 
agctaccctt 
atttttgttt 
agagctgtcc 
gaaaggtact 
tctggatgat 
agacagacat 
attacctact 
ttttctgtaa 
aagtagcagc 
tatttttagt 
ggcatggaag 
agttaagtta 
tattcatttg 
agcaaaatgt 
ataaaccgaa 
caaaacaaat 
aaaatgtttg 
aattatgaga 
ccgggaatta 
aggaaatctt 
actttttcct 
ggatgctgga 
agtgttacct 
ttgaccggta 
tgtgctcttt 
ctatgtggac 
tggacccacc 
ttgttgtggc 
acatatttcc 
cctgtggctc 
tcagacgtcc 
ctgtaatccc 
cattatgcaa 
ttgttgtgtt 
atatccaaaa 
agaaaaaatg 
cccatacccc 
atttatactt 
gttaggcata 
aacaacaaca 
cctgagtgca 
cctcccacct 
tacctgacta 
tggtcttgaa 
tacaggtgta 
taaactctac 
tcaacctaaa 
tgcctgtacc 
acacttatgt 
atggctagta 
taacagacaa 
ggttgtaatg 
tgtgtgtttt 
taatcacata 
acacagatgc 
agtcaacaac 
gttaaaaagt 



agcaccaacc 
aaacaaaata 
tagtatgtca 
tctgctttct 
cacattacag 
ggatggataa 
ttgccatagc 
tgttttcttt 
cataattatc 
agcttctagt 
cacttagcca 
gttggacaaa 
ctgaagttta 
aattttcaat 
tagtctaaat 
tttttactac 
aattaactgt 
gataacttat 
tttattagca 
aaaaaaaaat 
aagattagac 
ttagaaaaac 
ccctttagaa 
agaaactgta 
cagattttcc 
ggcatgattg 
gagcaactta 
gattttcctt 
ttttatcacc 
catggccatc 
cctcatcaca 
ctacaaccta 
actgattaag 
tggtttcaac 
tgccacccta 
ccatctgaca 
atcagaagag 
catgttgaat 
agaactgttc 
gtcattttat 
ttatatattt 
agtatttaca 
aaagtaagaa 
tagagaaaaa 
tagttttaga 
acagcaacaa 
gcggtgtgat 
cagcccctca 
atttttgtgt 
ttgctgagct 
agccactctg 
taaatacttt 
atatgagcct 
aacctcatct 
ttccacagtc 
gacttctaaa 
tctgttcaat 
atactatcat 
tgggtgtgga 
caaggaagcg 
cctcatgaaa 
agcaacaaca 
cttatttaag 



109260 
109320 
109380 
109440 
109500 
109560 
109620 
109680 
109740 
109800 
109860 
109920 
109980 
110040 
110100 
110160 
110220 
110280 
110340 
110400 
110460 
110520 
110580 
110640 
110700 
110760 
110820 
110880 
110940 
111000 
111060 
111120 
111180 
111240 
111300 
111360 
111420 
111480 
111540 
111600 
111660 
111720 
111780 
111840 
111900 
111960 
112020 
112080 
112140 
112200 
112260 
112320 
112380 
112440 
112500 
112560 
112620 
112680 
112740 
112800 
112860 
112920 
112980 
aatttatgac 
gggctatcaa 
ttcatgacac 
agacttcaaa 
gaaataaaaa 
aaattgtaaa 
ccatgtatat 
gaaaccaata 
tctctgtgtt 
cacattgaaa 
ggagagtcag 
atcactacaa 
cactcatcca 
attcactttg 
aaaggcctta 
cagaaataaa 
cattgcatgt 
ccctttagga 
tgtgaggctc 
cagccctagc 
ttcagcatct 
aaccacaaca 
ttaagatatc 
ggccagatct 
gtcacttttc 
tttgtcataa 
agttttcagc 
gataatcatg 
gggtatgtta 
ctttttgatg 
gttcatcagg 
tatcaggatg 
ttggaatagt 
tgtgattcca 
ttcagaactt 
tgactgtata 
tcttcagcaa 
ccaataacag 
agagaataaa 
actacaaacc 
gctcatggat 
gattcaatgc 
ttgtaaattt 
agaactaagc 
caaaaacagc 
cctcagaaat 
agcaatagga 
tgcagaaaac 
ataaaagact 
accattcagg 
acagaagcca 
gaaactatca 
ccatccagaa 
aaaaagtggg 
aaacatgtga 
caatgagata 
gatgctggag 
tagttcaacc 
catttgaccc 
ataaagacac 
accaacacaa 
tggcacacta 
acctggaaac 



aataattaat 
atctcaagct 
tggaataagc 
gatagcccac 
gcacaaatga 
ataaagatat 
atgtgtgtac 
attatatata 
ttataggcat 
ggaagaggaa 
cctttctctt 
tcctgcatca 
ccaagttcag 
tcatctgcct 
ttcttttcaa 
atatttcagg 
ataagtctaa 
ctatctcctt 
tgatacaaag 
acatccttaa 
acaggttaac 
cagagtacag 
accctgtctt 
tccaatactg 
aaagggaatg 
atagctctta 
atgaaggggt 
tggattttgt 
agttagcctt 
tgctgttgga 
gatactggcc 
acgctggcct 
ttcagaagga 
tctggtcctg 
gttattgatc 
tttagaaaac 
agtcttagga 
acaaacagag 
atgcctagga 
accactcaag 
aggaagaatc 
tatccctgtc 
catatgaaac 
tggaagcatc 
atgatactgg 
gatgccacac 
taaaggattc 
tgaagctgga 
taaacgtaag 
acataggaac 
aaactggcaa 
tcagagtgaa 
tctataagga 
caaaagatat 
aaaaaacctc 
ccatctcacg 
aggacgtgga 
attgtggaag 
aacaatccca 
atgcacatgt 
atgcccatca 
tgcagacata 
catcattctc 



gctcatgtta 
aaataatgaa 
aaacataaac 
tgaaaaagtt 
atatctttaa 
ttaatcattg 
atatgtatgt 
atatgtagag 
tgaagcattt 
gaaacagcta 
cttcttcctt 
gtgagtcctc 
agaggtggcc 
gacattggcc 
aggattttgt 
attgacctgc 
ttgtgtgaca 
gtctcttcag 
agtaggcagg 
agcacccctg 
ctccatgggc 
ggaaaacagc 
cctatttgaa 
tgtttaatag 
tttccagctt 
ttattttgag 
gttgaatttc 
cattcgttct 
gcatcccagc 
tttggtttgc 
tgaaattttc 
cataaaatga 
atggtaccag 
ggtttttttt 
tatccaggga 
cccattgtct 
tacaaaatca 
agacaaatca 
atccaactta 
gaaataagag 
aatatcatga 
aagctaccat 
tgaaaaagaa 
atactacctg 
taccaaaaca 
atctacaacc 
cctgtttaat 
ccccttcctt 
acctaaagcc 
aggcaaagac 
atgggatctc 
caggcaacct 
acttaaacaa 
gaacagacac 
atcatcactg 
ccagttagaa 
gaaacaggaa 
acagtgtggt 
ttactgggtt 
atgttcattg 
accatagact 
aaagaggatg 
agcaaactaa 
agtcagcatt 
gaacataaag 
ataagtcaga 
ctaattcaca 
tttatttcag 
aaaataaatg 
gtacatatat 
ttccaagcat 
ttagagtgga 
agtccctctc 
agtaaatttc 
tgaactggca 
atccgcattg 
ttgctataag 
gctgtcttca 
ctcaaatttt 
gaagtttccg 
ttcttcaatc 
tgctggcatc 
accagcaccc 
ttgagctgca 
ttttcaaaat 
taccctattt 
gagtggtgag 
ttgcccattc 
atacattcca 
atagaacgcc 
gtttatgtga 
gataaagcca 
cagtatttta 
tttttttgtt 
gttagggagg 
ctcctgtttg 
gattggtagg 
tttggcttct 
cagccccaaa 
atgtgcaaaa 
tgagtgaatt 
caagggatgt 
aggacacaaa 
aaatggtcaa 
tgactttctt 
cccatatagc 
acttcaaact 
catatataga 
atctgatctt 
aaatgatgtt 
atgccttaga 
acaaaaaccc 
ttcatgacta 
actaaactaa 
acaaaatggg 
atttacagga 
ttcacagaag 
gtcattagag 
tggcgatcat 
tgcttttaca 
gattcctcga 
acatacccaa 
cagcactatt 
ggataaagaa 
agttcatgtc 
cacaggaaca 



tatatttata 
aggcccaggt 
gaatgggtta 
gaaattttcc 
atattaggat 
tgaaaattta 
atatgcaaag 
atctgcaggc 
taggcaggta 
atgaggcttt 
atgcagacct 
ccatagttca 
ctaaacagaa 
atagcctcat 
agtgatattt 
agtttgtctt 
atactctgct 
aatttttcct 
cagcccagag 
tcaacctggc 
ccttcacaaa 
tggtaaggtt 
tttttttgcc 
agagggcatc 
agtatgatat 
tcaataccta 
tttttttctt 
tggattatgt 
acttgatcgt 
ttgaggatat 
gtgtctctgc 
agtccctctt 
tacctctggt 
ctattaatta 
tccttgtttt 
tctccttaag 
atcacaagca 
cccattcaca 
gaaggacctc 
cagatggaaa 
actgcccaaa 
cacagaatta 
caagacaatc 
atactacaag 
ccaatggaac 
tgacaaacct 
gggaaaactg 
caaaaattaa 
tiagaagaaaa 
aaacatcaaa 
agagcttctg 
agaaaatttt 
aataaacaaa 
aagacattta 
aaatgcacat 
gaaaaagtca 
ttgttagtgg 
ggatctagaa 
aagattataa 
cacaatagca 
aatatggcac 
ctatgcaggg 
gaaaacaaaa 



ctgctttgat 
tagaagaagc 
tcaaccactg 
aggaaaataa 
tacttgatgc 
aactctaaga 
actgattata 
tgaaggagag 
tataaatttt 
aagtggtctt 
tggcagccta 
ctaggaaagg 
taaaacaaaa 
aaactgcctg 
ttctccatgt 
ggttattggc 
tcatgaagta 
gaagttcttg 
ccaccctttc 
caacaggttt 
tcacaagaga 
agtagtggtg 
tgattgccct 
cttgtcttgt 
tggctgtgag 
gtttattgag 
catctattga 
ttattgactt 
ggtggataag 
ttgcatcgat 
caggttttga 
tttctgttat 
agaattcggc 
ttgccacaat 
ggagatgaca 
ctgatgaaca 
ttcctataca 
attgctacaa 
ttcaaggaga 
aacattccat 
gtaatttatt 
gaaaaatctc 
ctaagcaaaa 
gctacagtaa 
agaacagagg 
ggcaaaagca 
gctagccata 
ctcaagaggg 
cctaggcaat 
agcaatggca 
cacagcaaag 
tgcaatctat 
cgaccccatc 
tgtgggcaat 
caaaaaaaca 
ggaaacaaca 
gagtataaat 
ccggaaatac 
atattctact 
aagacttgaa 
atatacacca 
aaatggatga 
cactgcatgt 



113040 
113100 
113160 
113220 
113280 
113340 
113400 
113460 
113520 
113580 
113640 
113700 
113760 
113820 
113880 
113940 
114000 
114060 
114120 
114180 
114240 
114300 
114360 
114420 
114480 
114540 
114600 
114660 
114720 
114780 
114840 
114900 
114960 
115020 
115080 
115140 
115200 
115260 
115320 
115380 
115440 
115500 
115560 
115620 
115680 
115740 
115800 
115860 
115920 
115980 
116040 
116100 
116160 
116220 
116280 
116340 
116400 
116460 
116520 
116580 
116640 
116700 
116760 
tgttctcact 
cacacaccag 
cttaatgtag 
taacaaacct 
atgtcacctt 
taaaaatgta 
agggaagtac 
tgctaaatga 
tgaatctctc 
taggcagcat 
agttacagca 
ttgtaatagc 
agtgcagatg 
caacataata 
ctgtataatt 
gcaaagatat 
gtactggtaa 
aaaaattcat 
ggccgaggca 
accctgtctc 
ccagctacta 
gagccgagat 
aaaaaaaaaa 
tgtaaattag 
agcaagcaaa 
tattagtttg 
atttatttcc 
tttcattctg 
tttatttttc 
gattagggca 
gtctccaaca 
gagggaggga 
agaagtttga 
ttaagttgaa 
gttatcttat 
gaaggaccag 
ctagagaaat 
tctgtgaaat 
ttgtcaccag 
taagtttctg 
gagttagata 
agaaaaatgt 
ggatggaagt 
aaatgaaagc 
acatacccca 
aatagaaaaa 
aagttctgac 
tacagtatgc 
gttattatcc 
ttaaaactta 
aaattagtgg 
tcagaaatat 
atcaaggcat 
tgtgcatttg 
aattattgca 
gcttgatata 
ctgtgtctgt 
aaacaattgc 
accactttga 
tgatctaggg 
gtaaaagtca 
ccgcagggac 
tctgaaatat 



tataagtggg 
ggcttgtcag 
attacaggtt 
gcaggttctg 
gttaaaaaac 
ttcaaagaac 
acattacaat 
caaaaagaaa 
acataatgtt 
aaatttaagc 
gtcagacatg 
cacatctgga 
aaggaattat 
tttagcacca 
tctattacag 
agttaatttg 
tcttctattt 
gaagcaggcc 
ggtgaatcac 
tactaaaaat 
gggaggctga 
cccaccactg 
aatcataatt 
acttcaatat 
gcttcattcc 
ctagggctgc 
tcacagttct 
agttctcttt 
tgtgtgtatc 
cagctcaccc 
gtcatgctct 
aggggcacaa 
cacaaagttt 
agtaacatgg 
ggattttcca 
agagttgtga 
tattcagtga 
cccttaggga 
aaaagccatt 
catgtgtttg 
tatgattgtg 
agacacttag 
taatggcata 
taatatttat 
ttttgtgaaa 
aattatagtc 
atcagaatgt 
tttatggttt 
atatctatta 
taaaatagat 
taaaactaaa 
ctaactccag 
tcatctttgt 
tcaaatacac 
taaacaacat 
taaaaatgaa 
gtagaaagga 
tttgctgaga 
cccaacctgg 
ctgtgcagga 
tcgccattct 
ctctgccctt 
ggtctgaaac 



tgtgaacaat 
gggttfcgggg 
gatgggtgca 
cacatgtatc 
agatagtaga 
caagtaacat 
tacaagatat 
aaaatgatat 
agtgagaata 
tggacatatt 
catgaatatg 
aataacctat 
ttacatattc 
atgtagtgtt 
aaaattcaaa 
aggttactga 
gagaatttag 
gggcacggtg 
gaggtcagga 
acaaaaaatt 
ggcaggagaa 
cactccagcc 
catgaagctg 
aaagcttact 
aatatcaaag 
cacaaataag 
agaggctaga 
ctggcttgtg 
tgtattctaa 
attaggcttt 
gtggtgcttg 
tccagtccat 
gtgacactag 
taaacctaag 
atgtaagaag 
gttcccattt 
gaaaaaaaat 
aataaagtca 
tgatattctt 
taatcctgga 
atgattaaat 
catttaaggt 
aaaatataag 
taagggtttg 
tccgaaaaag 
cagaattatt 
gaactcaacc 
gtagtggaat 
taggcaaaat 
tagtgtttgg 
acacagaata 
atcctgtgaa 
gtttggtcta 
aaagttgggc 
taaacattaa 
tgaaggtatg 
aagacataag 
tgttgttaat 
agctcacaaa 
catgccttgt 
ctagtcttga 
gaaagcggag 
caggggcaca 
gagaacacat 
gctagggaag 
gcaaaccacc 
ccagaactta 
aaaaccagca 
atgactttgt 
ttgtgttttt 
tgctaagtac 
caaatggata 
tatactccat 
ttcatgaaga 
atataaagtt 
atcacactca 
cattggagac 
accagacaca 
ctaggaaagg 
aagcaagtaa 
gctcgtgcct 
ggttgagacc 
agccaggcgt 
tggcatgaac 
tgggcaacag 
tacacttacg 
aaaaacgaat 
cattctattt 
taccatgaac 
agtccaagat 
gatgatcatc 
tctcttctta 
attttactaa 
gagttagaac 
aacacagatt 
tacgagagaa 
gcaatgtgag 
gaaaatactg 
taaatttgtg 
ctgacagagt 
tcatacaaat 
tgtaaggata 
acacttgttg 
gactaggtag 
acttttattt 
aggcatgctc 
ccacacattg 
agtgcttttg 
aagaaacatg 
agtcgactcc 
ttctttctgt 
gtcatagaat 
cttataagaa 
aggaacatgt 
ttgattttat 
gtcctaaaac 
actttaaata 
acaaagaaac 
tagggaaaag 
agactccatt 
ttgtagcttt 
aacatgtgtt 
taacaaaatg 
taaaccagga 
tattgtccaa 
ttgcactgcg 



ggacacagga 
ggatagcact 
atagcacata 
aagtgtaata 
aaatgagaac 
gctcaacttt 
atatatctgt 
tggtgggact 
caaccagttt 
aagccagcac 
gatatgcaca 
tggtgtactt 
tcatcatgaa 
tagacctaac 
attaatctac 
gcatgaagac 
tacaagtatg 
gtaatcccag 
atcctggcca 
ggtggcgggc 
ccggaggcgg 
agcgacactc 
ttttgtgctc 
aaaaatagta 
acctatcagt 
ttggtgactt 
caaggtgtgg 
ttatcccgac 
taaggatgca 
atgttctctt 
ttcagcatat 
aagaaacgtg 
actgtatcag 
aatccatggc 
agaatgaaac 
ttgtgccaaa 
aattgcttca 
attataaatt 
gctcttccct 
tacaatcata 
aaagaaaaat 
gttaaagtct 
taggatcttt 
ggcacagtgc 
tatttgctaa 
acccagacta 
caaacgtatg 
actaaccatg 
tggtttgagg 
acaccatgta 
caaaagaaca 
gctaaggcta 
ttaagaatgt 
tatgaatttc 
aaaggtgaaa 
aaagagagat 
ttgaaaaaga 
gccccagcca 
gtatgaaatc 
tttacaagca 
gcacaatgca 
ggtttctccc 
gaaagccgta 



aggggaatat 
aggagaaata 
tatacctatg 
aaacaaatgt 
tggtgctata 
aataatcatt 
acctatatgt 
atggaacaat 
ggaaaactat 
ttctactcat 
agactgaatt 
attatacagc 
taaaaatcac 
aaaatatgca 
agtgttaaaa 
agatggtgga 
ttcactttgt 
gactttggga 
gcacagtgaa 
acccgtagtc 
agcttgcagt 
tgactcaaaa 
attatttgtg 
ctagtcttca 
acacagaggg 
aaacatgcag 
gcaaaattgg 
ctctttacac 
agttatattg 
tagatattgt 
gaattttgga 
aaaggctaat 
aaaagttgaa 
agacatgaat 
ataaggcaga 
tgtcatatct 
tttttgcata 
attcctgtat 
tattcataaa 
tgtattttca 
gccaattacc 
tgaataatga 
cactcaatat 
tatgcatatc 
tttcatctgt 
ctcagatcag 
tttctaccag 
agggaaatat 
gttgaatgtg 
agtgctggtt 
gagcagcatt 
tcatattttt 
caaccggatg 
actgtatata 
tctgacagga 
cagactgtta 
cctgtacttt 
ctttgaccca 
aaggtttaag 
gtatacattg 
ctgtggaaag 
catgtgatag 
gggacctctg 



116820 
116880 
116940 
117000 
117060 
117120 
117180 
117240 
117300 
117360 
117420 
117480 
117540 
117600 
117660 
117720 
117780 
117840 
117900 
117960 
118020 
118080 
118140 
118200 
118260 
118320 
118380 
118440 
118500 
118560 
118620 
118680 
118740 
118800 
118860 
118920 
118980 
119040 
119100 
119160 
119220 
119280 
119340 
119400 
119460 
119520 
119580 
119640 
119700 
119760 
119820 
119880 
119940 
120000 
120060 
120120 
120180 
120240 
120300 
120360 
120420 
120480 
120540 
cccttgaaag 
tgtggcatga 

ggaggattag 

tgcctgcccc 
tgagatcaga 
ttgttattct 
gcatgtccag 
atgtttgttg 
aataatgaag 
atatgctgag 
ttatttcttt 
caggccaccc 
actcgatgtt 
tataagatgt 
ttaatatcag 
tgctaagcaa 
aaaagaccag 
aaccacacag 
gaattagatg 
gagaggctag 
ggaaactaga 
ggaaaatttc 
aaccctcgaa 
cctggaaaat 
ttgtttgctc 
agatcattca 
tctgccagaa 
gcaagtccac 
aaagaagaga 
agggacataa 
tatccctact 
gaaaagaggg 
tcatgttgct 
aattatgaga 
ccaggaatta 
agggaatctt 
ctttttcctg 
gatgctggag 
gtgttacctt 
tgactagtac 
gtgttccttc 
catgtggacc 
agacccacca 
tgttgtggct 
catttttcct 
ctgtggctcc 
cagacctcca 
tgtgatcccc 
ttatccaaag 
gtcatgctgt 
tatccaaaat 
gaaatatgag 
cataccccaa 
ttgtatttta 
aggcatatag 
agcatctcac 
caaccttgat 
tagctgggac 
cggggtttca 
cctgggcctc 
ggtttaaaaa 
aagtcagcat 
cagttgtcat 



cggggtattg 
gaaagacctg 
taaaagagga 
tggggactga 
taaaaactgc 
ttactccact 
tcatagtacc 
ctgaccttct 
ataataatta 
tgccggtccc 
tctcagtctt 
cttcaaaggt 
gtataaatca 
tcttgtgaat 
tgttaacagt 
aaccggaaca 
gcttagagaa 
aatcagtgtt 
aaaaaaaatt 
catctgaccc 
agtggaagat 
ctccaaaaaa 
acgcaaatat 
ctctattgca 
aggcttctcc 
gtcaatattg 
ttataatcag 
ctataaagta 
agaagaatag 
aacaagaaaa 
ttagtttacc 
gaaaactttt 
attgtattgc 
aggaacttca 
cagattctcc 
agcatgattg 
agccacttat 
attttccttt 
tatatcatct 
atggccatct 
ctcatcacgg 
tacaacctag 
ctgattaagc 
ggctggaatc 
gctatcttaa 
catctgacag 
tcagaagagt 
atgttaatcc 
aactgttcaa 
cattttattt 
tatacatttt 
tatttacatc 
agtaaaaatt 
gaaaataaaa 
ttttaaactg 
tccatcaccc 
ctcagggttt 
tacaggtgtg 
tcatgttgcc 
caaaagagtt 
agagcacttc 
tccactttac 
ctcagctgtt 



tccaaggttt 
accatccccc 
aagcctcttg 
atgtctcggt 
cctatggtgg 
gagatgtttg 
ttcccttgaa 
tattatcacc 
ataaaaactg 
ctgggcccat 
tcatcccacc 
acatcactac 
caaataaaat 
atagatgatt 
ttttgttgtt 
ttctgaaagg 
cagatagcag 
tctttgcaac 
ttgtcaaact 
ctgcttaagc 
caagaggaca 
ttcaagtgga 
gtagatattt 
ttgagtctgg 
aagaggggat 
ttctttgata 
gaataaagaa 
catttctcaa 
agacagaagc 
ataaagggca 
tgaaacaaag 
ggtaagaaga 
aatcaatata 
cgttggtgac 
tcttcatgct 
ccctcatcca 
ccttcctgga 
cagagaagaa 
tggtacacgt 
gaaaccctct 
tgccttatgt 
ccttctgtgg 
tggcttgttc 
tttcgttttc 
ggattcgctc 
ctgttactat 
ccatggagca 
catgatctac 
aagaaaattg 
agcctataat 
ccttgaaggg 
aattaacaga 
tctatagtga 
cccttaaaac 
cttcttaaac 
aggatggact 
caaaccatcc 
caccaccata 
cagactggtc 
tgcattacat 
tgctaaactc 
catgtccaac 
tcctctgcct 
ctccccatgt 
agcccgacac 
cagttgagat 
ataaaacccg 
gaggtgagac 
ggtggagaga 
cttaattatg 
ctgccctcct 
agggaactca 
tgttgtttct 
caactagaaa 
tatgattgaa 
ctttgtttca 
tttatataga 
gttgtaagta 
atgttagggc 
cctaggcagg 
atcagcctga 
tgagtcatta 
ctttctaaac 
gaatttccat 
gtgttaatgg 
ttcaaaataa 
aggaagaaga 
tgaactggtg 
tcttgttctt 
attggttagt 
atataaagag 
tgaattagaa 
gtatcaaaaa 
ttcattagaa 
ttattactct 
tatgtaaagt 
tgagttcatt 
gtttctggcc 
ggccaatgcc 
tctgtgcttc 
aagcatttcc 
tgagatctac 
gctttatggc 
gtatggagcg 
ccccaacgaa 
tgacacctac 
tctcttcatc 
tacagagggc 
tttctatgca 
aggacaaatg 
agtctgagga 
tttcctaaat 
tttttcatag 
ttagggaatt 
tagtttgata 
ggaatcaatg 
cagacctggg 
atatattaag 
ggagtgcagt 
tcccacttac 
ccacactaat 
ttgaactgct 
gggtgagcca 
tactagatac 
ctaaaacgta 
gtaccaacct 



gatagtctga 
ccgtaaaggg 
agaggaaggc 
atagtacatt 
acgtttgcag 
aacataaatc 
acatagattc 
actacattcc 
gaggctggtg 
ctgtactttg 
tacccacagg 
tagatgtaga 
gatattcaaa 
aaaatagaag 
acagaacaca 
tcctaaatca 
tgtggaggct 
tcttcaaacc 
actcactccc 
agagaatctc 
gactctcaac 
ctggaattta 
ggggaacaaa 
aagcaacaag 
aaatctattc 
tcctctctct 
tttaaatgtt 
gggcagaata 
gggaatgaaa 
aggcagagag 
ataaaatgag 
actggtaata 
tcctttctcc 
ctcctgggac 
atttacatgg 
cggctccaca 
tcttccaatg 
tatcctgcct 
atcctggctg 
agcaaaatgt 
ctcactggcc 
attaatcact 
aacaaggagt 
atatttattt 
aggcaaaaag 
actctgttct 
gtagctgtac 
acaaggatgt 
aaacatcagt 
agcttagtgc 
tttatgaagt 
tcaatatagt 
taaataaaaa 
ttttcttttt 
ctattttttt 
ggcatgatca 
acccctcacc 
ttttgtattt 
gagtgcaagt 
ctgtgcccag 
ttttgcctat 
agcctcatcc 
catctccagc 



aatatggcct 
tctgtgatga 
cactgtctcc 
tgttcaattc 
caatgctgcc 
tggcttatgt 
tattgctcac 
tttttgctga 
caggtccttg 
tctctgtgtc 
tgtggagggg 
cacagctttt 
aataaccttc 
gtacatttca 
gctctggtta 
atgtgcggtt 
gagcagatca 
tcactgtaat 
ttaagacaag 
atctctcagt 
ttgacacatt 
atagctgtgc 
tcatgccctg 
tgagccccgc 
agaaataaca 
aaactcattc 
atgttttggg 
agagaaggag 
agaaagacaa 
ctgctttgaa 
gaaaactgct 
atagtaataa 
cactgaagga 
tgacgaatca 
tcacagtggc 
cgcccatgta 
tgaccccaaa 
gtcttgttca 
tgatggcctt 
ccaaaagtgt 
tgatggagac 
tctactgtgc 
tgtcaatgtt 
cctactttta 
ctttttctac 
tcatgtgtct 
tttataccac 
gaaaaaggct 
attgattttt 
aatacaaatt 
gtgaataaaa 
tttaactgcc 
aaaatctaat 
ggaaggagtt 
tttttgaaac 
cagcttacag 
cctgccacag 
tttgtagaga 
gatctatctg 
cctgtaataa 
cttacagagg 
cttgcacact 
ctaaaacaca 



120600 
120660 
120720 
120780 
120840 
120900 
120960 
121020 
121080 
121140 
121200 
121260 
121320 
121380 
121440 
121500 
121560 
121620 
121680 
121740 
121800 
121860 
121920 
121980 
122040 
122100 
122160 
122220 
122280 
122340 
122400 
122460 
122520 
122580 
122640 
122700 
122760 
122820 
122880 
122940 
123000 
123060 
123120 
123180 
123240 
123300 
123360 
123420 
123480 
123540 
123600 
123660 
123720 
123780 
123840 
123900 
123960 
124020 
124080 
124140 
124200 
124260 
124320 
ccaatgcttt 
tcttggcttt 
aagtctgtct 
catcaaccca 
gtggaggcct 
aagacgagag 
aggaaattcg 
acaacaagga 
tttaagaatt 
ctttgatggg 
cacacctgaa 
tgagaccatc 
gggcgtggtg 
ctgaacttgg 
gtgacagaag 
aaaacaaaaa 
atgacactgg 
cttcaaagat 
ataaaaagca 
ttataaaata 
atatatgtgt 
aataattata 
tgttttatag 
ttatttattt 
gcagggtcat 
ctggttttcc 
gagattaggg 
aagcacatct 
gagagcacgg 
tcttagtaca 
caatctgatt 
tcgtcatcat 
cgggcagagg 
ccgggcgggg 
ggccgggtag 
tgccccccac 
tggacagggc 
cggaggggct 
cagatgtggc 
agacggtcct 
acggggtcgc 
ggctcctcac 
ggtggcggct 
gaggtggagg 
gcactgagtg 
atcactcgcg 
tacaaaaacc 
gcaggagaat 
cggctgggca 
gtggagggag 
tttagagtgg 
aagtccctct 
ttcctcctca 
tgagtcctct 
gaggtggcca 
acattggcct 
ggattttgtg 
ttgacccacc 
tgtgtgacag 
ctcttcagtt 
gtaggcagct 
gcacctctga 
tccatgggct 



ggtccacagt 
ttagaggaaa 
aaggagaccc 
aaaggtataa 
atagttcccc 
actaatgatg 
gattattgaa 
tgaaactagt 
tatgacaata 
ctatcaaatc 
atcccagcac 
tgcctaacac 
gcgggcgccg 
gaggcgcagc 
gagactctgt 
caagcaaaca 
aataagcaaa 
atcccgcaga 
caaatgaata 
aagatattta 
gtatatatat 
tataatatgt 
gcattgaagc 
ttttagtatt 
aggacaatag 
taggcagagg 
agtggtgatg 
tgcaccgccc 
ggttgggggt 
gaacaaaatg 
tctctttctt 
ggcccattct 
ggctcctcac 
tggctgctgg 
gggctgcccc 
ctccctcccg 
ggctgctggg 
cctcacttcc 
ggcggccatg 
cacctcccag 
ggccaagcag 
atcccagacg 
gggcagaggc 
ttgtagcaag 
agtgagactc 
gtcaggagct 
agtcaggtga 
caggcaggga 
tcagagggat 
agggagaggg 
ataggtaggt 
catgagactg 
gtaaattttc 
gaactggcac 
tccatgttgc 
tgctatgcaa 
ctgtctacaa 
tcaaatttcg 
aagttcccaa 
cttcaatcaa 
gctggcatcc 
ccagcaacct 
tgatcagcac 



gaggcacact 
aataaatgac 
aaggctaaca 
catgtggttg 
tagaggtaac 
aggtgtgtgc 
gggctgcaac 
ccgtttaggc 
gattaacggt 
tgaagctaaa 
tttgggaggc 
ggtgaaaccc 
gtagtcttag 
ttgcagtgag 
ctcaaaacaa 
aaaaacaaaa 
tataaacata 
taagtttcta 
ccttcaactt 
atcattgaaa 
gtatgtgtac 
agagttccga 
attttacatt 
tattgatcat 
tggagggaag 
accctgcggc 
actcttaacg 
ttaatccatt 
aaggttatag 
gagtctccta 
ttccccacat 
caatgagctg 
ttcccagacg 
gcgggggctg 
ccacctccct 
gacggggcag 
tggagacacg 
cagacgtggc 
cggaggagct 
acggggtggc 
aggcgctcct 
atgggcggcc 
tgcaatctgg 
ccgagatgac 
cgtctgcaat 
ggagaccagc 
gcggggcatg 
ggttgcagtg 
accgtggaga 
agaccgtgga 
atataaattt 
tcttggagag 
tacagacctt 
catagttcac 
taaacagaat 
tagcatcata 
gtggtatttt 
gtttgtcttg 
tgctctgctt 
ctttttcctg 
agcccagagc 
caatctggcc 
cttcacaaat 
tatgtttcca 
cagtaggatt 
gataatctgt 
taatgatctt 
ctagaactta 
aaggtgagaa 
ctctagagta 
tttgtttcat 
catattaatt 
aaaacatgaa 
ccaggcggtt 
cgtctctact 
ctactcagga 
tggagattga 
acataaacaa 
cataaagagg 
agtcagagaa 
attgacagac 
attacagata 
taagtgtgaa 
atatatatgc 
gcatatctgc 
tttttttatg 
tcttgggtgt 
gtcagcagat 
cttccgcagt 
agcatgctgc 
taaccctgag 
attaacagca 
tgtctacttc 
ttccccctta 
ttgggtacac 
ggggggccgg 
ccccccacct 
cccggacggg 
ctggccgggc 
cctcacttcc 
ggctgccggg 
cctgacttct 
ggtcgggcag 
cacttcccag 
aggcagagac 
gcactttggg 
gccactgcac 
cccggcacct 
ccggccaaca 
cctgcaatcc 
agccgagatg 
gagagggaga 
aggagaggga 
tcacattgaa 
ggttacagca 
ggcagcctaa 
tgggaaaggc 
aaaacaaaaa 
aactgcctga 
tctccacctc 
gttatcggcc 
catgaagtac 
aagttctcgt 
cacctcttcc 
aacacttttt 
cagaaaagaa 



cagtccatat 
ctaaattgaa 
tcaatgcatg 
gaagatgtgt 
agatataatc 
attagaacag 
ggacagagtc 
ggaggagtta 
cagcgtttat 
ataggcccgg 
ggatcacgag 
aagagtacaa 
gactgaggca 
gccactgcac 
aaacaaaaac 
cccaggttag 
tgaattatca 
attttccagg 
ttaggattac 
aattaaaatt 
aaagactgat 
aggccgaagg 
ttctcagcac 
ttctcgaaga 
aaacatgtga 
gtttgtgtcc 
cttcaagcat 
tggacacagc 
tcccaaggca 
tttctacaca 
tctatttgac 
ctcccagacg 
gcagaggcgc 
ccctcccgga 
gcagctggcc 
aggggctgac 
cggatggggc 
cggaggggct 
caggcagggc 
agacactcct 
actgggtggc 
gctcctcact 
aggccaaggc 
tccagcctgg 

cggggggctg 
cggcgaaacc 
ccggcactcg 
gcggcagtac 
gggagaggaa 
gagggagagc 
aggaagagga 
gagaggctgc 
tctctacagt 
attcatccat 
ttcactttgt 
aaggtcttat 
agaaataaaa 
actgcttata 
cctttaggac 
gtggggatct 
aggcccagga 
tcagcatata 
accacaaccc 



ctggtttatt 
tgttacctaa 
tctaaaacca 
gtttttgggt 
acatacaagg 
agatgccctc 
aacaacagca 
aaaagtctta 
atttacactg 
cgtggtggct 
gtcaggagat 
aaaatcagca 
gaagaatggc 
tccagcctgg 
aagcaaacaa 
aagaaacctc 
actactcaga 
aaaataagaa 
ttgatgcaaa 
cgaagaccat 
tatataaagc 
agagactctg 
ctttatttat 
gggggatttg 
acaagggtct 
ctgggtactt 
ctgtttaaca 
acatgtttca 
gaagaatttt 
gacacagtaa 
aaaactgcca 
gggtggcggc 
cccccacctc 
tggggcggct 
gggcgggggc 
ccccacctcc 
ggctgccagg 
cctcacttct 
agccgggcag 
cagttcccag 
cgagcagagg 
tcccagacag 
aggcagctgg 
gtaacattga 
aggcgggcag 
caccaaaaaa 
gcaggctgag 
agtccagcct 
gagggagacc 
attgaagcat 
agaaacagct 
ctttctctcc 
cctgcagcag 
caagttcaga 
catctgcctg 
tattttcaaa 
tattttagga 
taagtctaat 
tacctcttgt 
gatataaaca 
catccttaaa 
caggtaaatc 
agagtaaaga 



124380 
124440 
124500 
124560 
124620 
124680 
124740 
124800 
124860 
124920 
124980 
125040 
125100 
125160 
125220 
125280 
125340 
125400 
125460 
125520 
125580 
125640 
125700 
125760 
125820 
125880 
125940 
126000 
126060 
126120 
126180 
126240 
126300 
126360 
126420 
126480 
126540 
126600 
126660 
126720 
126780 
126840 
126900 
126960 
127020 
127080 
127140 
127200 
127260 
127320 
127380 
127440 
127500 
127560 
127620 
127680 
127740 
127800 
127860 
127920 
127980 
128040 
128100 
aaaaacaacc 
aaacaaacaa 
atatttaaag 
tgtacattac 
ttgcagaatg 
cctctcatat 
cagcataaat 
ctaacaggca 
aacagccaca 
tggtgtactt 
tcattttgaa 
ctagacctaa 
aattaatgta 
ggcaagcaga 
atacaagtat 
tttctgtgtg 
agacttcaag 
acagagggta 
acatgcagat 
aaaactggtt 
ctttacactt 
tcatattgga 
cgatatcgtg 
aattttggag 
aatgctaata 
aaagttgaat 
gtcaggaatg 
taaggcagag 
gtcatctctc 
ttttgcatat 
ttcctgtatt 
ttcataggta 
atgtattttc 
tgccaattac 
ttgattaatg 
tcactcaata 
ctacgcatat 
atttcatcag 
actcagatca 
gtttctacca 
gagggaaata 
ggttgaatgt 
aagtgctggt 
agagcagcat 
attgcatttt 
tcaaccagat 
cactgtatat 
aggagcttga 
acacagcttt 
aaataacctt 
ggtacatttc 
agctctggtt 
aatgtgcggt 
tgagcagatc 
ttcactgtaa 
gacaagcagg 
tttttttttt 
gagtgcagtg 
cctgcctcag 
tactcctttc 
ggagagaatt 
agagagtgtt 
gacacttttc 



tttcaaagtt 
actgatagca 
aaccaagtaa 
aagatatctg 
acagaaaaac 
aatgctagtg 
ttaagctgca 
gaaatacatg 
cctggaagta 
actatacagc 
taaaaatcac 
cagaatatgc 
cagtgttaaa 
gagatagtgc 
gttcactttg 
taaattagac 
caagtaaagc 
ttagtttgct 
ttatttcctc 
tcattctgag 
tctttttctg 
ttagggcaca 
tctccaacag 
agtgagggaa 
gaattttgac 
taagttgaaa 
ttattgtatg 
aaggaccaga 
tagagaaatt 
ctgtgaaatc 
tgtcaccaga 
agtttctgca 
agagttagat 
cagaaaaatg 
aggatggaag 
taaatgaaag 
tacatactcc 
taatagaaaa 
gaagttctga 
gtacagtatg 
tgctattatc 
gttaaaactt 
taaattagtt 
ttcagaaata 
tatcaaagca 
gtgtgcattt 
aaattattgc 
tatataaaaa 
tactcaatgt 
ctataagttg 
attaatatca 
atgctaagca 
taaaagacca 
aaaccacaca 
tcaattagat 
gactaacatc 
tttttttttt 
gcaccgtctt 
cctcctgagt 
taaacagaga 
tccatgagtc 
actggctaga 
aaaataaggg 



agtaaggcta 
gaaaaaaatg 
cttatgactt 
tatctttata 
tgatattgtt 
ggaatgcaaa 
catatttata 
aatatgttcg 
tcctatatgt 
cgtggagatg 
caacataata 
actgtataat 
agtgaagatg 
agtactgtta 
taaaaattca 
ttcaatataa 
ttcattccaa 
agggctgcca 
acagttctag 
ttctttttct 
tgtgtatctg 
gctcacccgc 
tcacgctctg 
gaggcacaat 
acaaagtttg 
gtaacatggt 
gattttctaa 
gagttgtgag 
attcagtgag 
ccttagggaa 
aaagccattg 
tgtgttttta 
atatgattgt 
tagacagtta 
ttaatggcat 
ctaatattta 
attttgtgaa 
aaactatagc 
catcacagtg 
ctttatggtt 
catacctatt 
ataaaataga 
gtaaaactaa 
tctaactcca 
ttcatctttt 
gtcaaacaca 
aaaaacaaca 
tgaatgaggg 
tgtataaatc 
ttcttgtgaa 
gtgttaacag 
aaaccggaac 
ggctcagaga 
gaatcagtgt 
gaaaaaatat 
tgacccctgc 
tttttttttg 
ggctcactgc 
agctgggact 
atctggtctc 
ttaactccac 
gaatttaata 
gaaaaaaatc 
ttggtggtgt 
gcaaaatatg 
tgtgctcaac 
tatctgtacc 
aagcattgga 
tggatataaa 
ctccataaac 
tgaagagata 
gtatcaatag 
aaggaattat 
tttagcacca 
ttctattatg 
tagttaattt 
atgttctact 
tgaagctgta 
aggttactaa 
tatcaaagca 
caaataagta 
aggctagaag 
atcttgtgga 
tattctaatc 
tagaccttat 
tggggcttgg 
ccagtccata 
tgacactggt 
aaacctaagg 
tgtaagaagg 
ttcccatttt 
aaaaaaaatc 
ataaagtcat 
atattctttg 
atcctggaac 
gatgattaaa 
gtatttaaga 
aaaaatataa 
ttaagggttt 
atcctaaaaa 
ccagagttat 
tgatctcaac 
cgtagtggaa 
acaggcagag 
ttagcgtttg 
aacacagaat 
gatcctgtga 
tgtttggttg 
caaagttggg 
ttaaacaaag 
tacatcacta 
acaaataaaa 
tatagatgat 
tttttgttgt 
attctgaaag 
acagatagca 
ttctttgcaa 
tttgcaacct 
ctactccttt 
agacagcatc 
aagctccgcc 
acaggcgcct 
tcagtggaaa 
ataatggaaa 
gctgtgcaac 
atgccctgcc 



taaaatatca 
aactggtgct 
ctgaataata 
tatatatctg 
gagactgtgg 
cagtatggaa 
caacactact 
tgcacaagat 
taggaaggat 
ttacatattc 
atgtagtgtt 
gaaaactcaa 
gaggttagtg 
tgaggattta 
cgcctacatt 
aaacgaataa 
ttctatttac 
ccatgaactt 
tccaagatca 
tgatcatctt 
tcttcttata 
cttacttaaa 
agctggaact 
acacagatta 
aggagagaaa 
caatgtgaga 
aaaatgctga 
aaatttgtgt 
taacagagta 
catacaaata 
taaggacagc 
tctacttgct 
tgactaggta 
tacttttatt 
gaggcatgct 
accacacatt 
tagcactttt 
taagaaatat 
cagttgactc 
tttccttctg 
tgtcatagaa 
gattataaga 
aaggaacatg 
actgatttta 
agtcctaaaa 
cactttaaat 
aaacaaaggt 
ctatgattga 
tctttgtttc 
ttttatatag 
tgttgtaagt 
gatgttaggg 
gcctaggcag 
catcagcctg 
tgagtcatga 
atggtttttt 
tggctctgtc 
tcccgggttc 
gccaccacac 
ctagaagtgg 
atttcctcca 
cctcaaaagg 
tggagaatcc 



cccggttaaa 
atataaaaac 
attagggaag 
tgtctgtatg 
aacaactgac 
aactactagg 
gcccaaagta 
tgaattttgt 
ttattaattt 
atgacactcc 
cattggaagg 
aagcagacac 
actaggaaag 
gaggcaagta 
ttgtgcacac 
aaatagtact 
ccatcagtac 
ggtgacttaa 
aggtgtgggc 
atcccgacct 
aggatgcaag 
tgttctcttt 
tcagcataag 
agaaatgtaa 
ctgtatcaga 
atccatggca 
gactgaaaca 
tgtgccaaat 
attgcttcat 
ttataaatta 
tcttccctta 
atacaatcgt 
gaaagaaaaa 
tgttaaagtt 
ctaggatctt 
gggcacagtg 
gtgtttgtta 
gacccagact 
caaagcacat 
tactaaccat 
tcggtttgag 
aacaccatgt 
tcaaaagaat 
tgctaagcct 
cttaagaatg 
atgtgaattt 
gaaatctgac 
atagatgtag 
agatattcaa 
aaaaatagaa 
aacagaacac 
ctcctaaatc 
gtgtggaggc 
atcttcaaac 
acttacttaa 
tttttttttt 
gtccaggctg 
acgccattct 
ctgccctgac 
aagatcaaga 
aaaaattcaa 
caaatatgca 
ctattgcact 



128160 
128220 
128280 
128340 
128400 
128460 
128520 
128580 
128640 
128700 
128760 
128820 
128880 
128940 
129000 
129060 
129120 
129180 
129240 
129300 
129360 
129420 
129480 
129540 
129600 
129660 
129720 
129780 
129840 
129900 
129960 
130020 
130080 
130140 
130200 
130260 
130320 
130380 
130440 
130500 
130560 
130620 
130680 
130740 
130800 
130860 
130920 
130980 
131040 
131100 
131160 
131220 
131280 
131340 
131400 
131460 
131520 
131580 
131640 
131700 
131760 
131820 
131880 
gagcctggag 
gaagggatcg 
ttttgatact 
aaaaagaaat 
tttctcaaat 
aaacagaagc 
taaagggcag 
gaaacaaagt 
taaaagatca 
atgtaaagtt 
gagttcattc 
tttctggcca 
gccaacgcct 
ctgtgcttct 
agcatttcct 
gagatctaca 
ctttatggca 
tatggagcgc 
cccaatgaaa 
gacacctaca 
ctcttcatca 
acagagggca 
ttctatgcaa 
ggtaaaatgg 
agccttagaa 
ttttcttaaa 
ctgagaaata 
ttttattatt 
gatctcggct 
ccgagtagct 
agagacaggg 
atgaaacacc 
gttcacaaaa 
tctttgtctt 
catgtggcta 
cttaattgat 
ccagcaagac 
actcacaatg 
tggagtttgc 
tacaaattca 
aatcacattt 
atgtgtgttc 
tatgtagttt 
accttaccac 
attttaaatt 
tgacatctgt 
cctacaatga 
ttaatcagta 
ctataatact 
tgattgtaat 
gttttgctct 
cacctttata 
ttctctgtcc 
tccaatcatt 
aactggagat 
ttaatgaact 
catgtttgta 
atgagctaca 
gtcagagcca 
tgaattacag 
aaagtaacat 
gagagtgaca 
gtaaagaaga 



gaataagaaa 
gattggtgaa 
tgtccttttc 
tgcttagttc 
aaaaggagag 
tgaatttgaa 
tatcataaaa 
tcagtagaaa 
ttactgtaat 
cctttcttcc 
tcctgggact 
tttacatggt 
ggctccacat 
cttccaatgt 
atcctgcctg 
tcctggctgt 
gcagaatgtc 
tcactggcct 
ttaatcactt 
acaaggagtt 
tatgtatttc 
ggcaaaaagc 
cccttttctt 
tagctgtatt 
ataaaaatgt 
aatcagtatt 
tagtgcatca 
atattttgag 
cactgcaacc 
gggaacacaa 
tttcaccatg 
gcgcccagtc 
gcttttatgt 
ccgtgcacag 
tttctggaag 
tgttagarca 
ttaatatagt 
gcaatacaaa 
ttgtttccct 
cacaggctga 
gagattttgt 
agttaaaatt 
taaatgtcaa 
taactcccca 
atatgcttgt 
ttattaaaga 
tcacttcttt 
tctactgtta 
ttgagcatca 
tctcttcttg 
tctttttgta 
gggtcaaaca 
tcccattgga 
ggtttaaggc 
tcaacaaaaa 
tttctgctgt 
tttttctgca 
tggggttccc 
tgggcaaaca 
tatatttaat 
ttatcatcaa 
tgaacaaaaa 
gttttcgttg 



gcaacaaatg 
aactattcac 
tctctttacg 
tgaatgttat 
gcagaataag 
gggagtgaaa 
ggtagagagc 
tagaatgagg 
agtaataatc 
accgaaggaa 
gaccagtcgc 
cacggtggca 
gcccatgtac 
gactccaaag 
tcttgtgcag 
gatggccttt 
caagagtgtg 

ctactgtgcg 
gtcaatgttt 
ctacctttac 
tttttctacc 
catgtatctc 
ttataccaca 
aaaagaagca 
cttttggttt 
atggagaaca 
atggagtttc 
tttgcctccc 
gcgcacacca 
ctggccaggc 
taaaaacttt 
tttaagttgt 
aatggctttg 
atgagattaa 
tttttgagtg 
ggctgcccag 
acggctgata 
agcacaaaat 
attatataat 
tttgtgttct 
atttattata 
ataatattac 
tagataatac 
attcctctgt 
taaggtgcca 
ctcttcattg 
aaattactat 
ctaacactag 
cacagttact 
tagtagctac 
caatagatga 
ttctgctctg 
atttcattca 
taagacagaa 
tatcatgagt 
atgtcagaat 
gaggaggcaa 
ggctcaagtc 
cagtctataa 
gaataaaggg 
gaatagccta 
tgggcaaagc 



agcccaacct 
aaagaacaag 
cgcattctct 
gttttggggc 
gggaggagaa 
agaaagaaaa 
tgctttgaat 
aaaactgcca 
atgttgctat 
attatgagaa 
cgggaattac 
gggaaccttg 
tttttcctga 
atgctggaga 
tgttaccttt 
gaccggtaca 
tgctccttcc 
atgtggacct 
gacccaccac 
attgtggctg 
attttccctg 
tgtggctccc 
agacccccct 
gtaatcccta 
ttaatcaaag 
ctaaagccct 
ttgcagtttt 
tctctgtctt 
gggttcaagc 
ccatgcccga 
tcggcctccc 
attttctaaa 
cattcatctt 
tacctccatg 
aattcacata 
gatctgttgt 
tcaggagaca 
aataacgttt 
agcctattct 
atagaatgag 
tgtagtcatt 
attaccattt 
aagccatata 
tttcttgtta 
actgattttt 
tcattctttc 
tcacagtagg 
gaatgtatca 
tattaattat 
ggattttcat 
ttttccccaa 
tctatagttt 
tttgcagagc 
ttcaacttga 
aaaagtctta 
ggcacaaggt 
tttcttgatg 
ttctccttaa 
attcaacaag 
atgtaataga 
actggaaatg 
ctcaggccct 
cttcagtggc 



gtttgctcag 
atcctcgggc 
gcccagtatt 
aagtccacct 
acgagagaag 
tggaaagaaa 
atccctaatt 
aaaagagggg 
tgtattgtaa 
gaaactgcac 
aaattctcct 
gcatgattgt 
gccacttatc 
ttttcctttc 
ttatcgcctt 
tggccatctg 
tcatcacggt 
acaacctagc 
tgattaagct 
gctggaacct 
ctattttaaa 
atctgacagc 
caaaggaatc 
tgctgaacct 
agctgtcaat 
tcctagactt 
caaaacttta 
aggctggagt 
aattctcctg 
ctcatttttt 
aaagtgctgg 
attcaaatac 
gttcagcagt 
cccttaggtt 
tgtcacttct 
ttttattttt 
tggaacactg 
tcataagcta 
gattgatgcc 
aacatcacaa 
tttaatgctt 
gaacctactc 
taaactcctg 
tttcctttgg 
caaatatgca 
accaatcact 
atcacatttg 
ttaataatac 
taataatatg 
aaagttgaca 
actgtcacta 
ccatattttt 
tgttatttga 
ttacatgaca 
aagagagttt 
cagagataac 
ctatttcaat 
aacttcctgt 
tcagtcagta 
ttacacattg 
ggttaaggaa 
ggtggtccat 
aaatgcaaga 



gcttctccag 
aaatattgct 
ataatcggga 
ataaagtaca 
aagaatagaa 
acaagaaaaa 
tagtttacct 
acgtttttgg 
tccaatatat 
gttggtgact 
cttcacgctg 
cctcatccag 
cttcgtggat 
agagaagaaa 
ggtccatgtt 
caaccctctg 
gccttatgtg 
cttctgtggc 
ggcttgttct 
ttctttttct 
gattcgctct 
tgtcactata 
tgttgaacag 
tataatttat 
gaagatatac 
ttttctttag 
tttattttta 
gcagtggtgt 
cctcagcctc 
gtattttagt 
gattacacgc 
gtacaatttt 
tattttaagt 
taggtaagat 
cagttgaata 
tccatggtga 
ttctcatccg 
ctatgatatg 
atattaatta 
accttgaaga 
ttacatacac 
atgattaatg 
tctcacttct 
aaattaaccc 
cataatttat 
tcattcttac 
aatttttggc 
ctgaagtata 
aatactacta 
atggcctcca 
aactgtaaag 
ttctcagacg 
agataatgat 
atatgcaaca 
tatggaacag 
taggtccatg 
cataaaatcc 
tcactcagta 
ttgcagtcca 
tacagcaaac 
ccagtccact 
caataacctt 
tgcttatctc 



131940 
132000 
132060 
132120 
132180 
132240 
132300 
132360 
132420 
132480 
132540 
132600 
132660 
132720 
132780 
132840 
132900 
132960 
133020 
133080 
133140 
133200 
133260 
133320 
133380 
133440 
133500 
133560 
133620 
133680 
133740 
133800 
133860 
133920 
133980 
134040 
134100 
134160 
134220 
134280 
134340 
134400 
134460 
134520 
134580 
134640 
134700 
134760 
134820 
134880 
134940 
135000 
135060 
135120 
135180 
135240 
135300 
135360 
135420 
135480 
135540 
135600 
135660 
aagtgataac 
tttatggaca 
ttatctggtt 
tgcctttaaa 
ctctatacta 
agccttagct 
ctgaggtggg 
accctgtttc 
cagctatttg 
gagccgagac 
aaaatagact 
gtgtagtgtc 
aagggttgta 
aagttggaaa 
atgaccaaat 
tgaatattca 
cccatgttcc 
accaaaaggt 
gagccactag 
aaaccgcaaa 
ctcacaaggg 
ggaggggaga 
aaagtttttc 
gattttattt 
tattattgca 
attttagcaa 
taagatagtg 
ccttccactt 
tcgcttctcc 
accatgttca 
ctgagcaaga 
atcctattct 
ttaagggaaa 
tcataattat 
agttttgatc 
tttaaaatga 
atgatttaag 
ggtgggaatc 
acattcttgt 
ctgatgataa 
tttgtaatta 
aaacttagag 
agttattgtg 
caagtatgct 
tgatcaaaac 
cctgtcagtc 
tcatcagtct 
ttccagaaat 
tgcaaatccc 
attgaagttt 
gaaatatggt 
agttaagtgt 
cgttggataa 
aacatcttga 
cactgtgatt 
tgctgatgcc 
gtcatgaaaa 
tttcagtatc 
tgtgtttaaa 
aaatctgtag 
gaaacatcct 
gtttcaagta 
ccaatcgcca 



aagatggtgt 
cagagtcctc 
ggatgcagtc 
aatgtaagat 
cgattcagcc 
gaagagctgg 
tgaatcacca 
tactaaaaat 
ggaggctgag 
cgcactgttg 
tagctgaaga 
aactgaggaa 
gcctgtaggg 
gagacacttc 
atattacaca 
caaagaagag 
ctttggagtg 
gaatcagaag 
gttattggtg 
aagagaagtc 
caggtttctg 
taacgaggca 
tgaggtttcc 
gtatttctca 
ttcataacat 
gttctcaaag 
ttctataaag 
tctacctttg 
catcctaaca 
agttctttat 
attggagaaa 
tcatagcaga 
tatttgtgag 
ccatagcatt 
tttatagcaa 
acagaattac 
gtgcaggaaa 
agacaacagg 
gcatcacttt 
aacattctct 
ttcagaactg 
aagttccctc 
aatatcaaat 
gagttccagt 
aataagttaa 
actttcatca 
cttcctgttg 
tatttgtacc 
tttgttgaag 
gtacttagta 
agtgctcttg 
cctcagacta 
caggtgattg 
ttcaagataa 
tcacttctaa 
agtgatgtgg 
agacaccctt 
taataatgtc 
taggtagttt 
ggagcttact 
ggtggaattt 
atattatttt 
caagaaaaag 



cttttaaggt 
tagtaagaac 
tttcttgatt 
gaagtcattt 
ccaggttccc 
gtgcagtggc 
gaggtcagga 
acaaaaatta 
gcaggagcat 
ccctccagcc 
attatctctt 
tgatgagatt 
tgaccgttct 
gacagtatga 
tatatgataa 
gcacaggcat 
gggacttaac 
acaccaagac 
gtctcttatc 
cagtgtcagg 
tttaaccctt 
tgtctgatct 
ttgaccaaga 
ttaggtacca 
ggttctataa 
gaaaggacca 
ggaggatgcc 
tctgtaatta 
tatattcaag 
cataaatggc 
gtaatttcat 
gagacagata 
gaataaatct 
tgggaaacaa 
ggttataatg 
actaggctgt 
caaaagggac 
tatatattga 
ttcatagcca 
tcaagttgaa 
agcctgaatc 
tccatatcag 
attattttta 
gtggactcaa 
atatgttagt 
ttggcagtca 
tattgtttca 
tagtattatt 
aaatccttta 
catattcttg 
agatgggaag 
ttttaaattt 
ctttgagctc 
aatagtacaa 
ttacctcttc 
gtatatatct 
attttcacag 
tgtattcact 
tgaggcaaat 
tagcttcatc 
catagtatgg 
tgtgttttcc 
gcaagttgaa 



ggctgtttca 
tgatagtgga 
aggcaaaaca 
tctaagatgg 
ttctacactt 
tcatgtttgt 
gttcgccact 
gctgagcatg 
tgcttggacc 
tgggcaacaa 
cctgcaaggt 
cataaattta 
gacaggctgt 
agagtaagac 
tacatattca 
gaatagtagg 
atttaaatgt 
cctctgtgca 
aagaaggaat 
tggtttgcag 
attgtaggaa 
cccatctgtc 
ggcagtctgt 
catgatgatt 
ttatatgcat 
catctttttg 
catttttttt 
tggccaatta 
gcagaacaat 
ccactgaaag 
tggcagcaga 
acaataaatg 
ttgtagcaat 
ctgacaattt 
gaaaaatcaa 
ctgggacaga 
ctcattgttg 
gacaacgaaa 
accgtcctaa 
acagaataca 
ctagttttat 
tttattcaaa 
atatacattt 
tttcatgaaa 
tattcattta 
ccagtctctt 
tctatctgtt 
tgtttacata 
aaaatatttt 
tgtgaacttg 
gacaattacc 
gtttttcagt 
tactgaagta 
ccaattagta 
ttacacattt 
tccagcactg 
gtaagaaagg 
ttgttaaatt 
ttgcacagac 
tttcactttt 
ctatttcagc 
atttgcctaa 
aataatttaa 



agctgctgaa 
agagtgactt 
tctggccctt 
agtaacttat 
tcctttaccc 
aatcccagca 
agcctggcca 
ctggtggaca 
caggagacag 
gagtgaaatt 
ttctttaaca 
gaaaggtgga 
gaagcatatc 

sgggatttat 

acaggctata 
ctaatataag 
gtcatgatta 
cagcctctgt 
gctggtcaat 
atatcagtgg 
gcctaatggt 
atggcaggaa 
tcaattgatt 
cacccacaga 
gtaaatctgt 
tttttatatt 
ttgaaactgt 
cagatttctc 
agatcattta 
cccagcaacg 
caggaaagat 
ctgaactaca 
gtattttcct 
ttatcacctt 
ccactgtgta 
ggcaagggaa 
ctcagacaga 
tatccaatcc 
gatttatgcc 
gttgtggaaa 
tactattttg 
tgcaaaaccc 
tcctcaattg 
gttttttaac 
tttaacatat 
ctacctgttt 
tctaaaacat 
tcttaggcat 
tattattcag 
ataaaggaca 
tgaattcagg 
attgtataaa 
taaaatatag 
tttcccttga 
ggggattctc 
gccagaattt 
aaatgttttt 
gtgtacattt 
aatgcatttt 
ttgatattac 
agatgttttc 
tgtcttggtg 
gtctacatag 



atcctgctct 
tgttatgtcc 
gttggcatga 
atcaatgggg 
ttaccatttc 
ctttgggagg 
acatggtaaa 
cctgtaatcc 
aggttgcagt 
ctatctcaag 
cattcaggat 
ctttctcata 
ctccagctag 
gctgaatggg 
gaaaaaacta 
caacatgcat 
gactctatgc 
tgactggcaa 
tgctgtgttg 
tggtgcgagt 
gtttagcaag 
ctcagatttt 
caggggttag 
ttcacaatta 
tccttccact 
tttaccacct 
gagaacaatc 
tccatgatac 
gtttaagaaa 
tgaatcataa 
cacatactac 
gtaaaagatg 
tgatatgcaa 
tataaatgtt 
ataaaattat 
gggctgagtc 
aaagaggttt 
ttgaaaaagt 
atgtgataag 
aatatttggc 
ttagctgggt 
catttcatgg 
cattttgagg 
atgggaaaca 
gattattgtc 
cattgtttct 
ttcatgtttt 
cccatcaaaa 
attaaatagt 
aacaatggag 
tttcatgaga 
gtgctaccct 
atttttttct 
gaatgttctc 
ctgagatata 
gacctttggt 
gcatattatt 
tagacagcaa 
ccatgaaaga 
cagtgttatt 
actgttatta 
cctactgtca 
cttaattaat 



135720 
135780 
135840 
135900 
135960 
136020 
136080 
136140 
136200 
136260 
136320 
136380 
136440 
136500 
136560 
136620 
136680 
136740 
136800 
136860 
136920 
136980 
137040 
137100 
137160 
137220 
137280 
137340 
137400 
137460 
137520 
137580 
137640 
137700 
137760 
137820 
137880 
137940 
138000 
138060 
138120 
138180 
138240 
138300 
138360 
138420 
138480 
138540 
138600 
138660 
138720 
138780 
138840 
138900 
138960 
139020 
139080 
139140 
139200 
139260 
139320 
139380 
139440 
cacataaaat 
ccattaaaca 
tgagtagatc 
atgccctatt 
actttagttt 
cctgattata 
taacttgtgt 
ctaggctgca 
aagttattga 
acagatttct 
aatgctagaa 
gaccagtgaa 
aagaccctgg 
ccttcttgcc 
tcatattctt 
catctcttat 
ctcaaagtcc 
agataaattg 
aaccaaagga 
tctgagaatt 
gaccagaatg 
attgtccact 
taattatctg 
gtaaattagt 
aagatctcat 
aaagtgctac 
tgacaagttc 
gatgtaaatt 
gagacagatt 
tttgtgataa 
gaatacacat 
atgcacaatt 
tactaaaata 
tcacctatat 
tatttttttc 
cttgatatct 
tttatacttc 
attggaattt 
atccttccct 
ttctttcctt 
ctttttcttt 
ttctttgttt 
atgatctcag 
tcctgagtag 
agtagagaca 
tcaacccgtc 
catttttaag 
ctgggtatat 
cttagggtga 
ataatggagg 
catgtgagcc 
gcaagaaatt 
tatgaaattt 
caacagacag 
atatcaatgt 
tataaacaaa 
atataaaaat 
tcaccgaatg 
ggattcagtc 
tctacctctc 
ttcaaaatct 
atttcccaca 
gcacactgtg 



ttttctgacg 
tgtgtgaaag 
ttaagttctc 
tttgtctaga 
tcagaatatt 
atcaaatcag 
caaggacctt 
caagtagtca 
ctgcatttga 
catgggctct 
ttaaaatatg 
acctgagcaa 
attgaagaag 
aattaagtta 
tacagtttcg 
ctttgttaga 
tcatttattc 
aattttaaca 
agagatagac 
ataatatatg 
tcaaattttc 
gctctatttc 
ctataagcac 
gtttttttaa 
acaatttaat 
tcgtattgac 
tccactaatg 
tcattttacg 
aacacatatg 
attcagtaaa 
taaaccactc 
taataatttt 
gtccctcttt 
acatgtagct 
caatttcatg 
aataagtaac 
cgtataaata 
atttgactct 
tagtgaacaa 
tcctttcctt 
ctttctttct 
ctgagacgaa 
tgcactacaa 
ctggggttac 
gggtttcacc 
ttggcctccc 
gctttttaat 
tagcagggag 
ttgagttaat 
aataggaaat 
aagcagggga 
aaaatatagg 
catatagaat 
acaggctaaa 
ttaagactca 
gaaccaatat 
acaccgagac 
gggatgtatg 
tctcagtctt 
catgtagctt 
ctattacaat 
gcataaggat 
attacaaatt 



tctaagtaaa 
tggtccatta 
ttgggttctc 
gacctagttt 
attttacttt 
tctgattata 
aaacagtaag 
cagaatcatt 
atgtttgttt 
atatctgtat 
gttaaygata 
atttgcaaag 
ttgctcacca 
caattcatag 
gcactgatac 
gtgcttgtca 
atatttgttt 
ttttggaaac 
ctttctggga 
atttgtcttc 
tgagcctggt 
tagcatccag 
gcatgcatac 
ctataaaatt 
agaaaatatg 
ttcctgagac 
acatttgctc 
caattgttag 
atattttaac 
tgcataatgt 
cctagaatta 
cccatttaaa 
ccaaccctga 
gttcctggac 
tcaataacac 
tctgcttatt 
ttagaatcag 
atgggtcact 
agaatctctc 
tcctttcttt 
ttcttccttc 
ctctcgcctc 
cctctgcctc 
aggtgcacac 
atgttggcca 
aaaatgctgg 
gttgtgtggc 
taagtataga 
tccttttatc 
tctttgatat 
ctataatgtt 
aacaaattaa 
tgagatattg 
ttgggatgtg 
aacgaaaggc 
acatatcaca 
agtaagttga 
agataagcat 
cagaatgaag 
tgtatgtata 
ttaatcttta 
actttcgttt 
cagcctcagt 



ataaatgctc 
gcttcagtat 
ttattttata 
ttttacaagc 
tttcaacaaa 
atcaaatcag 
ttgatgaatg 
ctcgggtcat 
tgagtgtttc 
attagctgtc 
atggtctcca 
ctccttagag 
ccagagagtt 
gcattattct 
ctcggctggc 
ctatcatctg 
cctgggacag 
aatttctgtc 
aagtaaaata 
tgattcttct 
tactcaccta 
caccatgtct 
acaaatgcat 
tcactttaat 
tgtctgatat 
tgctcagaag 
atgtctgaat 
tatactgaaa 
ttaatttgat 
ctatataaat 
taagtaattt 
taatctataa 
tctgaaatgt 
tttctattct 
actgtcctga 
gccctgagag 
ctattccaca 
taggaaagac 
tccattttta 
cctttccttt 
cttccttcct 
tctttgccac 
ctgtgttcaa 
ctccatacct 
ggctggtctc 
gattataggc 
aatggcatca 
ttgagaacct 
ctaatggcta 
aggacttctt 
ggttctatgc 
aacatagatt 
atgaaggagc 
caaattggaa 
tttatttaca 
atgaaaagca 
actcggtgag 
ctcagtggga 
agctttgcct 
tgtaagcatg 
acatataagt 
ataattttgc 
ggaaaccact 



aaacaggctt 
gtggagctaa 
agattctggg 
aatacttgtg 
atacttgaaa 
tctgattcta 
aaattctcat 
tatcaaattc 
ttgcctctca 
tctgattgcc 
ctaaaagtgc 
actaagtcag 
aaaagaagcc 
agattcttgg 
tcgctttctt 
taactcaaag 
gataaggagg 
attacccaac 
ttacagaagg 
gcttatgagt 
actaaattta 
ggaacttgaa 
ggctgacaac 
aagagtttag 
tcttagatta 
agataagttt 
tgccccatgg 
ttgtttttaa 
attttatata 
tgtaagctac 
ctgtgcattg 
tccattaagc 
catctttgtt 
ggaacattag 
atactggttt 
atattgagtc 
agaattgtgt 
tgaccttttt 
gttctttctt 
cctttctttc 
tccttccttc 
ccaggctgga 
gtgattctcc 
ggctgatttg 
gaactcatga 
ataagccacc 
aagtgatcac 
gaaaacattt 
aactacacat 
tatacagaga 
agcatgaatg 
gatacctcaa 
ataaatacac 
aaaatgctta 
taagttccca 
ctactactaa 
atatatagta 
ttagtcactt 
gcattatcct 
tcattagaag 
tatctagatt 
attaagcctc 
tgttgcaaat 



tttattgtga 
cgaacttggg 
ttcttttcca 
cctcagtgag 
cataaatcag 
atcaaataac 
acctattttt 
atcatcatca 
gcagcagtgg 
attgtcccgc 
atgtccagat 
gactaaaacc 
cttgtgcctt 
cccagatttt 
cttcctcatc 
acaactctaa 
caataatatt 
cagtcaatgg 
agagcaccgt 
tgtaagctgg 
tttctactat 
aaatgctcaa 
atattacaaa 
caattgttgt 
ggtgaaaaat 
tatcagcttc 
aaaaaaatta 
ccaaaatttg 
tttaarataa 
aaaatataat 
tttaaagtaa 
caatagtaaa 
atatgccaca 
ttaaagtgta 
tacagcaaat 
ttacggttat 
ttgaattttg 
atgctattga 
tctctctctt 
tttcttcttt 
cttccttcct 
gtgcagtggc 
tgcctcagtc 
ttttattttt 
ccagaagtga 
acatccagcc 
ataatttatt 
ttttaatcat 
cagggttaac 
tataaaagag 
ttgtaaaata 
agataataca 
ttgtttaacc 
taaacacgtg 
aaaactaact 
aaacaacaaa 
tattagcaga 
catctctcta 
caaagtcctt 
ttctaagcat 
tttaaaagaa 
caacttaatt 
gtttagactt 



139500 
139560 
139620 
139680 
139740 
139800 
139860 
139920 
139980 
140040 
140100 
140160 
140220 
140280 
140340 
140400 
140460 
140520 
140580 
140640 
140700 
140760 
140820 
140880 
140940 
141000 
141060 
141120 
141180 
141240 
141300 
141360 
141420 
141480 
141540 
141600 
141660 
141720 
141780 
141840 
141900 
141960 
142020 
142080 
142140 
142200 
142260 
142320 
142380 
142440 
142500 
142560 
142620 
142680 
142740 
142800 
142860 
142920 
142980 
143040 
143100 
143160 
143220 
gcttcatagt ataactgttc atcttcagtt acagaactgc tactgagata acataactaa 143280
agccttttgg ctctttttat acaaagcatg atatttaact agggttttag tgatttttaa 143340 
aaagtttctc tttctcctta gatattcaga ccaatgcgtc tcatatgaga tgaagaaatg 143400 
tccagaagaa actatactga actgacagaa tttgttctct tgggtctaac aagccgtcca 143460 
gagctgcgag ttgctttctt ggcactgttc ctttttgtct acatagccac tgtggtagga 143520
aacttgggga tgattatttt aatcaaagtt gattctcgac ttcacactcc catgtaattt 143580 
tttctctcca gtttgtccat tctagatctg tgtttctcca caaatttcac tcccaaaatg 143640 
ctagaaaatt tcttatcaga gaagaagacc atttcctatg caggttgttt gatgcagtgc 143700 
tatgttgtca ttgctgtggt ccttgcagag cactgcatgt tggcagtcat ggcatatgac 143760 
cgctatatgg ccatctgtaa tccattgctc tacagtagca aaatgtccca aggtgtttgt 143820 
gtccacctgg tcattgtccc ttatgtctat ggctttcttc tcagtgtgat ggaaacctta 143880 
aggacctaca acctctcctt ctgtggaaca aatgaaatca accatttcta ctgtgctgat 143940 
cctcctctta tcaaactggc atgctctgac acgtacagca aggagctgtc catgtacata 144000 
gtagccggct acagcaacgt ccagtctctt ctratcattc tcacatccta catgttcatc 144060 
cttgtcgcta tcctcagaag ccattctgca gagggaagga aaaaagcttt ttccacatgt 144120
ggttcccacc tgacagttgt cacaatcttc tatggaaccc tcttctgcat gcatttgaga 144180 
cgtcccacag acgagtccgt ggagcagggg aaaatggtgg ctgtgtttta caccacagtg 144240
atactcatgc tgaactccat gatctatggc ctcaggaaca aggatgtgaa agaggcgttg 144300
aaaaaagcaa taggaaaaca aacattggga aaataaaaat gctaagctat cattaaaaat 144360 
ttgtgaagta atgagatata atatcattgg gttagatgtc acattttagg ctacatttgc 144420
acaattcatt tctaattttc tgttaggtag ctgactgagt 144460 



<210> 2 

<211> 195 

<212> DNA 

<213> Homo sapiens 



<400> 2 

atg agc ttc tta ata aga agt gat tca aca cta cac act cca atg tgc 48
Met Ser Phe Leu Ile Arg Ser Asp Ser Thr Leu His Thr Pro Met Cys
1 5 10 15

ttg ttc ctc agt cat ctc tcc ttt gta gat ctc tat tat gcc acc aat 96
Leu Phe Leu Ser His Leu Ser Phe Val Asp Leu Tyr Tyr Ala Thr Asn
20 25 30

gcc act cct ccg atg ctg gtt aac ttt ttt ttt cca aga gaa aaa ccg 144
Ala Thr Pro Pro Met Leu Val Asn Phe Phe Phe Pro Arg Glu Lys Pro
35 40 45

ttt cct tta ttg gtt gct tta tcc aat ttc acc ttt tca ttg cac tgg 192
Phe Pro Leu Leu Val Ala Leu Ser Asn Phe Thr Phe Ser Leu His Trp
50 55 60

tga 195
65



<210> 3 

<211> 948 

<212> DNA 

<213> Homo sapiens 



<400> 3 

atg ttc tcc cca aac cac acc ata gtg aca gaa ttc att ctc ttg gga 48
Met Phe Ser Pro Asn His Thr Ile Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

ctg aca gac gac cca gtg cta gag aag atc ctg ttt ggg gta ttc ctt 96
Leu Thr Asp Asp Pro Val Leu Glu Lys Ile Leu Phe Gly Val Phe Leu
20 25 30

gcg atc tac cta atc aca ctg gca ggc aac ctg tgc atg atc ctg ctg 144
Ala Ile Tyr Leu Ile Thr Leu Ala Gly Asn Leu Cys Met Ile Leu Leu
35 40 45

atc agg acc aat tcc cac ctg caa aca ccc atg tat ttc ttc ctt ggc 192
Ile Arg Thr Asn Ser His Leu Gln Thr Pro Met Tyr Phe Phe Leu Gly
50 55 60

cac ctc tcc ttt gta gac att tgc tat tct tcc aat gtt act cca aat 240
His Leu Ser Phe Val Asp Ile Cys Tyr Ser Ser Asn Val Thr Pro Asn
65 70 75 80

atg ctg cac aat ttc ctc tca gaa cag aag acc atc tcc tac gct gga 288
Met Leu His Asn Phe Leu Ser Glu Gln Lys Thr Ile Ser Tyr Ala Gly
85 90 95

tgc ttc aca cag tgt ctt ctc ttc atc gcc cta gtg atc act gag ttt 336
Cys Phe Thr Gln Cys Leu Leu Phe Ile Ala Leu Val Ile Thr Glu Phe
100 105 110

tac ttc ctt gct tca atg gca ttg gat cgc tat gta gcc att tgc agc 384
Tyr Phe Leu Ala Ser Met Ala Leu Asp Arg Tyr Val Ala Ile Cys Ser
115 120 125

cct tta cat tac agt tcc agg atg tcc aag aac att tgc atc tct ctg 432
Pro Leu His Tyr Ser Ser Arg Met Ser Lys Asn Ile Cys Ile Ser Leu
130 135 140

gtc act gtg cct tac atg tat ggc ttc ctt aat ggg ctc tct cag aca 480
Val Thr Val Pro Tyr Met Tyr Gly Phe Leu Asn Gly Leu Ser Gln Thr
145 150 155 160

ctg ctg acc ttt cac tta tcc ttc tgt ggc tcc ctt gaa atc aat cat 528
Leu Leu Thr Phe His Leu Ser Phe Cys Gly Ser Leu Glu Ile Asn His
165 170 175

ttc tac tgc gct gat cct cct ctt atc atg ctg gcc tgc tct gac acc 576
Phe Tyr Cys Ala Asp Pro Pro Leu Ile Met Leu Ala Cys Ser Asp Thr
180 185 190

cgt gtc aaa aag atg gca atg ttt gta gtt gca ggc ttt act ctc tca 624
Arg Val Lys Lys Met Ala Met Phe Val Val Ala Gly Phe Thr Leu Ser
195 200 205

agc tct ctc ttc atc att ctt ctg tcc tat ctt ttc att ttt gca gcg 672
Ser Ser Leu Phe Ile Ile Leu Leu Ser Tyr Leu Phe Ile Phe Ala Ala
210 215 220

atc ttc agg atc cgt tct gct gaa ggc agg cac aaa gcc ttt tct acg 720
Ile Phe Arg Ile Arg Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr
225 230 235 240

tgt gct tcc cac ctg aca ata gtc act ttg ttt tat gga acc ctc ttc 768
Cys Ala Ser His Leu Thr Ile Val Thr Leu Phe Tyr Gly Thr Leu Phe
245 250 255

tgc atg tac gta agg cct cca tca gag aag tct gta gag gag tcc aaa 816
Cys Met Tyr Val Arg Pro Pro Ser Glu Lys Ser Val Glu Glu Ser Lys
260 265 270

ata act gca gtc ttt tat act ttt ttg acc cca atg ctg aac cca ttg 864
Ile Thr Ala Val Phe Tyr Thr Phe Leu Thr Pro Met Leu Asn Pro Leu
275 280 285

atc tat agc cta cgg aac aca gat gta atc ctt gcc atg caa caa atg 912
Ile Tyr Ser Leu Arg Asn Thr Asp Val Ile Leu Ala Met Gln Gln Met
290 295 300

att agg gga aaa tcc ttt cat aaa att gca gtt tag 948
Ile Arg Gly Lys Ser Phe His Lys Ile Ala Val *
305 310 315

<210> 4 
<211> 519 
<212> DNA 

<213> Homo sapiens 



<400> 4 

atg tta aag aaa aac cat aca gcc gtg act gag ttt gtt ctc ctg gga 48
Met Leu Lys Lys Asn His Thr Ala Val Thr Glu Phe Val Leu Leu Gly
1 5 10 15

ctg aca gat cgg gct gag ctg cag tcc ctt ctt ttt gtg gta ttt cta 96
Leu Thr Asp Arg Ala Glu Leu Gln Ser Leu Leu Phe Val Val Phe Leu
20 25 30

gtc atc tac ctt atc aca gta atc ggc aat gtg agc atg atc ttg tta 144
Val Ile Tyr Leu Ile Thr Val Ile Gly Asn Val Ser Met Ile Leu Leu
35 40 45

atc aga agt gac tcg aca cta cac act cca atg tac ttc ttc ctc agt 192
Ile Arg Ser Asp Ser Thr Leu His Thr Pro Met Tyr Phe Phe Leu Ser
50 55 60

cac ctc tcc ttt gta gat ctc tgt tat acc acc aat gtt act cct cag 240
His Leu Ser Phe Val Asp Leu Cys Tyr Thr Thr Asn Val Thr Pro Gln
65 70 75 80

atg ctg gtt aac ttt tta tcc aag aga aaa acc att tcc ttc atc ggc 288
Met Leu Val Asn Phe Leu Ser Lys Arg Lys Thr Ile Ser Phe Ile Gly
85 90 95

tgc ttt atc caa ttt cac ttt ttc att gca ctg gtg att aca gat tat 336
Cys Phe Ile Gln Phe His Phe Phe Ile Ala Leu Val Ile Thr Asp Tyr
100 105 110

tat atg ctc aca gtg atg gct tat gac cgc tac atg gcc atc tgc aag 384
Tyr Met Leu Thr Val Met Ala Tyr Asp Arg Tyr Met Ala Ile Cys Lys
115 120 125

ccc ttg tta tat gga agc aaa atg acc agg tgt gtc tgc ctc tgt ctg 432
Pro Leu Leu Tyr Gly Ser Lys Met Thr Arg Cys Val Cys Leu Cys Leu
130 135 140

gct gct gct ccc tat att tat ggc ttt gca aat ggt cta agc aca gac 480
Ala Ala Ala Pro Tyr Ile Tyr Gly Phe Ala Asn Gly Leu Ser Thr Asp
145 150 155 160

cac cct gat gct tcg tct gtc ctt ctg tgg acc caa tga 519
His Pro Asp Ala Ser Ser Val Leu Leu Trp Thr Gln *
165 170

<210> 5 

<211> 948 

<212> DNA 

<213> Homo sapiens 

<400> 5 

atg ttg tcc cca aac cac acc ata gtg aca gaa ttc att ctc tta gga 48
Met Leu Ser Pro Asn His Thr Ile Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

ctg aca gac gac cca gtg cta gag aag atc ctg ttt ggg gtg ttc ctg 96
Leu Thr Asp Asp Pro Val Leu Glu Lys Ile Leu Phe Gly Val Phe Leu
20 25 30

gcg atc tac cta atc aca ctg gca ggc aac ctg tgc atg atc ctg ctg 144
Ala Ile Tyr Leu Ile Thr Leu Ala Gly Asn Leu Cys Met Ile Leu Leu
35 40 45

atc agg acc aat tcc caa ctg caa aca ccc atg tat ttc ttc ctt ggt 192
Ile Arg Thr Asn Ser Gln Leu Gln Thr Pro Met Tyr Phe Phe Leu Gly
50 55 60

cac ctc tcc ttt tta gac att tgc tat tct tcc aat gtt act cca aat 240
His Leu Ser Phe Leu Asp Ile Cys Tyr Ser Ser Asn Val Thr Pro Asn
65 70 75 80

atg ctg cac aat ttc ctc tca gaa cag aag acc atc tcc tac gct gga 288
Met Leu His Asn Phe Leu Ser Glu Gln Lys Thr Ile Ser Tyr Ala Gly
85 90 95

tgc ttc aca cag tgt ctt ctc ttc atc gcc cta gtg atc act gag ttt 336
Cys Phe Thr Gln Cys Leu Leu Phe Ile Ala Leu Val Ile Thr Glu Phe
100 105 110

tac ttc ctt gct tca atg gca ttg gat cgc tat gta gcc att tgc agc 384
Tyr Phe Leu Ala Ser Met Ala Leu Asp Arg Tyr Val Ala Ile Cys Ser
115 120 125

cct tta cat tac agt tcc agg atg tcc aag aac att tgc atc tct ctg 432
Pro Leu His Tyr Ser Ser Arg Met Ser Lys Asn Ile Cys Ile Ser Leu
130 135 140

gtc act gtg cct tac atg tat ggc ttc ctt aat ggg ctc tct cag aca 480
Val Thr Val Pro Tyr Met Tyr Gly Phe Leu Asn Gly Leu Ser Gln Thr
145 150 155 160

ctg ctg acc ttt cac tta tcc ttc tgt ggc tcc ctt gaa atc aat cat 528
Leu Leu Thr Phe His Leu Ser Phe Cys Gly Ser Leu Glu Ile Asn His
165 170 175

ttc tac tgc gct gat cct cct ctt atc atg ctg gcc tgc tct gac acc 576
Phe Tyr Cys Ala Asp Pro Pro Leu Ile Met Leu Ala Cys Ser Asp Thr
180 185 190

cgt gtc aaa aag atg gca atg ttt gta gtt gca ggc ttt act ctc tca 624
Arg Val Lys Lys Met Ala Met Phe Val Val Ala Gly Phe Thr Leu Ser
195 200 205

agc tct ctc ttc atc att ctt ctg tcc tat ctt ttc att ttt gca gcg 672
Ser Ser Leu Phe Ile Ile Leu Leu Ser Tyr Leu Phe Ile Phe Ala Ala
210 215 220

atc ttc agg atc cgt tct gct gaa ggc agg cac aaa gcc ttt tct acg 720
Ile Phe Arg Ile Arg Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr
225 230 235 240

tgt gct tcc cac ctg aca ata gtc act ttg ttt tat gga acc ctc ttc 768
Cys Ala Ser His Leu Thr Ile Val Thr Leu Phe Tyr Gly Thr Leu Phe
245 250 255

tgc atg tac gta agg cct cca tca gag aag tct gta gag gag tcc aaa 816
Cys Met Tyr Val Arg Pro Pro Ser Glu Lys Ser Val Glu Glu Ser Lys
260 265 270

ata att gca gtc ttt tat act ttt ttg agc cca atg ctg aac cca ttg 864
Ile Ile Ala Val Phe Tyr Thr Phe Leu Ser Pro Met Leu Asn Pro Leu
275 280 285

atc tat agc cta cgg aac aga gat gta atc ctt gcc ata caa caa atg 912
Ile Tyr Ser Leu Arg Asn Arg Asp Val Ile Leu Ala Ile Gln Gln Met
290 295 300

att agg gga aaa tcc ttt tgt aaa att gca gtt tag 948
Ile Arg Gly Lys Ser Phe Cys Lys Ile Ala Val *
305 310 315



<210> 6 

<211> 918 

<212> DNA 

<213> Homo sapiens 



<400> 6 

atg tcc aac aca aat ggc agt gca atc aca gaa ttc att tta ctt ggg 48
Met Ser Asn Thr Asn Gly Ser Ala Ile Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

ctc aca gat tgc ccg gaa ctc cag tct ctg ctt ttt gtg ctg ttt ctg 96
Leu Thr Asp Cys Pro Glu Leu Gln Ser Leu Leu Phe Val Leu Phe Leu
20 25 30

gtt gtt tac ctc gtc acc ctg cta ggc aac ctg ggc atg ata atg tta 144
Val Val Tyr Leu Val Thr Leu Leu Gly Asn Leu Gly Met Ile Met Leu
35 40 45

atg aga ctg gac tct cgc ctt cac acg ccc atg tac ttc ttc ctc act 192
Met Arg Leu Asp Ser Arg Leu His Thr Pro Met Tyr Phe Phe Leu Thr
50 55 60

aac tta gcc ttt gtg gat ttg tgc tat aca tca aat gca acc ccg cag 240
Asn Leu Ala Phe Val Asp Leu Cys Tyr Thr Ser Asn Ala Thr Pro Gln
65 70 75 80

atg tcg act aat atc gta tct gag aag acc att tcc ttt gct ggt tgc 288
Met Ser Thr Asn Ile Val Ser Glu Lys Thr Ile Ser Phe Ala Gly Cys
85 90 95

ttt aca cag tgc tac att ttc att gcc ctt cta ctc act gag ttt tac 336
Phe Thr Gln Cys Tyr Ile Phe Ile Ala Leu Leu Leu Thr Glu Phe Tyr
100 105 110

atg ctg gca gca atg gcc tat gac cgc tat gtg gcc ata tat gac cct 384
Met Leu Ala Ala Met Ala Tyr Asp Arg Tyr Val Ala Ile Tyr Asp Pro
115 120 125

ctg cgc tac agt gtg aaa acg tcc agg aga gtt tgc atc tgc ttg gcc 432
Leu Arg Tyr Ser Val Lys Thr Ser Arg Arg Val Cys Ile Cys Leu Ala
130 135 140

aca ttt ccc tat gtc tat ggc ttc tca gat gga ctc ttc cag gcc atc 480
Thr Phe Pro Tyr Val Tyr Gly Phe Ser Asp Gly Leu Phe Gln Ala Ile
145 150 155 160

ctg acc ttc cgc ctg acc ttc tgt aga tcc aat gtc atc aac cac ttc 528
Leu Thr Phe Arg Leu Thr Phe Cys Arg Ser Asn Val Ile Asn His Phe
165 170 175

tac tgt gct gac ccg ccg ctc att aag ctt tct tgt tct gat act tat 576
Tyr Cys Ala Asp Pro Pro Leu Ile Lys Leu Ser Cys Ser Asp Thr Tyr
180 185 190

gtc aaa gag cat gcc atg ttc ata tct gct ggc ttc aac ctc tcc agc 624
Val Lys Glu His Ala Met Phe Ile Ser Ala Gly Phe Asn Leu Ser Ser
195 200 205

tcc ctc acc atc gtc ttg gtg tcc tat gcc ttc att ctt gct gcc atc 672
Ser Leu Thr Ile Val Leu Val Ser Tyr Ala Phe Ile Leu Ala Ala Ile
210 215 220

ctc cgg atc aaa tca gca gag gga agg cac aag gca ttc tcc acc tgt 720
Leu Arg Ile Lys Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr Cys
225 230 235 240

ggt tcc cat atg atg gct gtc acc ctg ttt tat ggg act ctc ttt tgc 768
Gly Ser His Met Met Ala Val Thr Leu Phe Tyr Gly Thr Leu Phe Cys
245 250 255

atg tat ata aga cca cca aca gat aag act gtt gag gaa tct aaa ata 816
Met Tyr Ile Arg Pro Pro Thr Asp Lys Thr Val Glu Glu Ser Lys Ile
260 265 270

ata gct gtc ttt tac acc ttt gtg agt ccg gta ctt aat cca ttg atc 864
Ile Ala Val Phe Tyr Thr Phe Val Ser Pro Val Leu Asn Pro Leu Ile
275 280 285

tac agt ctg agg aat aaa gat gtg aag cag gcc ttg aag aat gtc ctg 912
Tyr Ser Leu Arg Asn Lys Asp Val Lys Gln Ala Leu Lys Asn Val Leu
290 295 300

aga tga 918
Arg *
305



<210> 7 

<211> 612 

<212> DNA 

<213> Homo sapiens 



<400> 7 

atg gtt aga gga aat tct act ttg gtg acg gaa ttt att ctc ttg gga 48
Met Val Arg Gly Asn Ser Thr Leu Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

tta aag gat ctt cca gag ctt cag ccc atc ctc ttt gta ctg ttc ctg 96
Leu Lys Asp Leu Pro Glu Leu Gln Pro Ile Leu Phe Val Leu Phe Leu
20 25 30

cta atc tac ctg atc act gtc ggg ggg aac ctt ggg atg ttg gtg ttg 144
Leu Ile Tyr Leu Ile Thr Val Gly Gly Asn Leu Gly Met Leu Val Leu
35 40 45

atc agg ata gat tca cgc ctc cac acc ccc atg tat ttc ttt ctt gct 192
Ile Arg Ile Asp Ser Arg Leu His Thr Pro Met Tyr Phe Phe Leu Ala
50 55 60

agt ttg tcc tgc ttg gat ttg tat tac tcc act aat gtg act ccc aag 240
Ser Leu Ser Cys Leu Asp Leu Tyr Tyr Ser Thr Asn Val Thr Pro Lys
65 70 75 80

atg ttg gtg aac ttc ttc tca gac aag aaa gcc att tcc tat gct gct 288
Met Leu Val Asn Phe Phe Ser Asp Lys Lys Ala Ile Ser Tyr Ala Ala
85 90 95

tgt tta gtc cag tgc tat ttt ttc att gct gtg gtg att act gaa tat 336
Cys Leu Val Gln Cys Tyr Phe Phe Ile Ala Val Val Ile Thr Glu Tyr
100 105 110

tat atg cta gct gta atg gcc tat gat agg tat gtg gcc atc tgt aac 384
Tyr Met Leu Ala Val Met Ala Tyr Asp Arg Tyr Val Ala Ile Cys Asn
115 120 125

cct ttg ctt tac agc agc aag atg tcc aaa ggg ctc tgt att cgc ctg 432
Pro Leu Leu Tyr Ser Ser Lys Met Ser Lys Gly Leu Cys Ile Arg Leu
130 135 140

att gct ggt cca tat gtc tat ggg ttt ctt agt gga ctg atg gaa acc 480
Ile Ala Gly Pro Tyr Val Tyr Gly Phe Leu Ser Gly Leu Met Glu Thr
145 150 155 160

atg tgg aca tac cac ttg acc ttc tgt ggc tcc aat atc att aat cac 528
Met Trp Thr Tyr His Leu Thr Phe Cys Gly Ser Asn Ile Ile Asn His
165 170 175

ttc tac tgt gct gac cca ccc ctc atc cga ctt tcc tgc tct gac act 576
Phe Tyr Cys Ala Asp Pro Pro Leu Ile Arg Leu Ser Cys Ser Asp Thr
180 185 190

ttc att aag gaa aca tcc atg ttt gtg gta gca tga 612
Phe Ile Lys Glu Thr Ser Met Phe Val Val Ala *
195 200



<210> 8 

<211> 807 

<212> DNA 

<213> Homo sapiens 



<400> 8 

ttg ccc tca tcc agg cca acg ccc cgg ctc cac acg ccc atg tac ttt 48
Leu Pro Ser Ser Arg Pro Thr Pro Arg Leu His Thr Pro Met Tyr Phe
1 5 10 15

ttc ctg agc aac tta tcc ttt gtg gat ctg tgc ttc tct tcc aat gtg 96
Phe Leu Ser Asn Leu Ser Phe Val Asp Leu Cys Phe Ser Ser Asn Val
20 25 30

act cca agg atg ctg gag att ttc ctt tca gag aag aaa agc att tcc 144
Thr Pro Arg Met Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser
35 40 45

tat cct gcc cgt ctt gtg cag tgt tac ctt ttt atc acc ttg gtc cac 192
Tyr Pro Ala Arg Leu Val Gln Cys Tyr Leu Phe Ile Thr Leu Val His
50 55 60

gtt gag ctc tac atc ctg gct gtg atg gcc ttt gac cgg tac atg gcc 240
Val Glu Leu Tyr Ile Leu Ala Val Met Ala Phe Asp Arg Tyr Met Ala
65 70 75 80

atc tgc aac cct ctg ctt tat ggc agc aga atg tcc aag agc gtg tgc 288
Ile Cys Asn Pro Leu Leu Tyr Gly Ser Arg Met Ser Lys Ser Val Cys
85 90 95

tct ttc ctc atc aca gtg ctt tat gtg tat gga gca ctc act ggc ctg 336
Ser Phe Leu Ile Thr Val Leu Tyr Val Tyr Gly Ala Leu Thr Gly Leu
100 105 110

atg gag act atg tgg acc tac aac cta gcc ttc tgt ggc ccc agt gaa 384
Met Glu Thr Met Trp Thr Tyr Asn Leu Ala Phe Cys Gly Pro Ser Glu
115 120 125

att aat cac ttc tac tgt gtg gac cca cca ctg att aag ctg gct tgt 432
Ile Asn His Phe Tyr Cys Val Asp Pro Pro Leu Ile Lys Leu Ala Cys
130 135 140

tct gac acc tac aac aag gag gtg tca atg ttt gtt gtg gct ggt ttc 480
Ser Asp Thr Tyr Asn Lys Glu Val Ser Met Phe Val Val Ala Gly Phe
145 150 155 160

aac ttc act tat cct ctc ctt atc atc ctc att tcc tat ctc tac ata 528
Asn Phe Thr Tyr Pro Leu Leu Ile Ile Leu Ile Ser Tyr Leu Tyr Ile
165 170 175

ttt cct gcc acc cta agg atc tgc tct aca gaa ggc agg cac aaa gct 576
Phe Pro Ala Thr Leu Arg Ile Cys Ser Thr Glu Gly Arg His Lys Ala
180 185 190

ttt tct acc tgt ggc tcc cat ctg aca gcc gtt act att ttc tat tca 624
Phe Ser Thr Cys Gly Ser His Leu Thr Ala Val Thr Ile Phe Tyr Ser
195 200 205

gct ctt ttc ttc atg tat ctc aga cgt cca tca gaa gag tcc atg gag 672
Ala Leu Phe Phe Met Tyr Leu Arg Arg Pro Ser Glu Glu Ser Met Glu
210 215 220

cag ggg aaa atg gta gct gta ttt tat acc act gta atc ccc atg ttg 720
Gln Gly Lys Met Val Ala Val Phe Tyr Thr Thr Val Ile Pro Met Leu
225 230 235 240

aat ccc atg atc tac agt ctg agg aac aaa gat gtg aaa gag gca tta 768
Asn Pro Met Ile Tyr Ser Leu Arg Asn Lys Asp Val Lys Glu Ala Leu
245 250 255

tgc aaa gaa ctg ttc aaa aga aaa ttg ttt tct aaa taa 807
Cys Lys Glu Leu Phe Lys Arg Lys Leu Phe Ser Lys *
260 265



<210> 9 

<211> 363 

<212> DNA 

<213> Homo sapiens 



<400> 9 
atg aga agg aac ttc acg ttg gtg act gag ttc att ctc ctg gga ctg 48
Met Arg Arg Asn Phe Thr Leu Val Thr Glu Phe Ile Leu Leu Gly Leu
1 5 10 15

acg aat cac cag gaa tta cag att ctc ctc ttc atg ctg ttt ctg gcc 96
Thr Asn His Gln Glu Leu Gln Ile Leu Leu Phe Met Leu Phe Leu Ala
20 25 30

att tac atg gtc aca gtg gca ggg aat ctt agc atg att gcc ctc atc 144
Ile Tyr Met Val Thr Val Ala Gly Asn Leu Ser Met Ile Ala Leu Ile
35 40 45

cag gcc aat gcc cgg ctc cac acg ccc atg tac ttt ttc ctg agc cac 192
Gln Ala Asn Ala Arg Leu His Thr Pro Met Tyr Phe Phe Leu Ser His
50 55 60

tta tcc ttc ctg gat ctg tgc ttc tct tcc aat gtg acc cca aag atg 240
Leu Ser Phe Leu Asp Leu Cys Phe Ser Ser Asn Val Thr Pro Lys Met
65 70 75 80

ctg gag att ttc ctt tca gag aag aaa agc att tcc tat cct gcc tgt 288
Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser Tyr Pro Ala Cys
85 90 95

ctt gtt cag tgt tac ctt tat atc atc ttg gta cac gtt gag atc tac 336
Leu Val Gln Cys Tyr Leu Tyr Ile Ile Leu Val His Val Glu Ile Tyr
100 105 110

atc ctg gct gtg atg gcc ttt gac tag 363
Ile Leu Ala Val Met Ala Phe Asp
115 120



<210> 10 

<211> 936 

<212> DNA 

<213> Homo sapiens 



<400> 10 

atg aga aga aac tgc acg ttg gtg act gag ttc att ctc ctg gga ctg 48
Met Arg Arg Asn Cys Thr Leu Val Thr Glu Phe Ile Leu Leu Gly Leu
1 5 10 15

acc agt cgc cgg gaa tta caa att ctc ctc ttc acg ctg ttt ctg gcc 96
Thr Ser Arg Arg Glu Leu Gln Ile Leu Leu Phe Thr Leu Phe Leu Ala
20 25 30

att tac atg gtc acg gtg gca ggg aac ctt ggc atg att gtc ctc atc 144
Ile Tyr Met Val Thr Val Ala Gly Asn Leu Gly Met Ile Val Leu Ile
35 40 45

cag gcc aac gcc tgg ctc cac atg ccc atg tac ttt ttc ctg agc cac 192
Gln Ala Asn Ala Trp Leu His Met Pro Met Tyr Phe Phe Leu Ser His
50 55 60

tta tcc ttc gtg gat ctg tgc ttc tct tcc aat gtg act cca aag atg 240
Leu Ser Phe Val Asp Leu Cys Phe Ser Ser Asn Val Thr Pro Lys Met
65 70 75 80

ctg gag att ttc ctt tca gag aag aaa agc att tcc tat cct gcc tgt 288
Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser Tyr Pro Ala Cys
85 90 95

ctt gtg cag tgt tac ctt ttt atc gcc ttg gtc cat gtt gag atc tac 336
Leu Val Gln Cys Tyr Leu Phe Ile Ala Leu Val His Val Glu Ile Tyr
100 105 110

atc ctg gct gtg atg gcc ttt gac cgg tac atg gcc atc tgc aac cct 384
Ile Leu Ala Val Met Ala Phe Asp Arg Tyr Met Ala Ile Cys Asn Pro
115 120 125

ctg ctt tat ggc agc aga atg tcc aag agt gtg tgc tcc ttc ctc atc 432
Leu Leu Tyr Gly Ser Arg Met Ser Lys Ser Val Cys Ser Phe Leu Ile
130 135 140

acg gtg cct tat gtg tat gga gcg ctc act ggc ctg atg gag acc atg 480
Thr Val Pro Tyr Val Tyr Gly Ala Leu Thr Gly Leu Met Glu Thr Met
145 150 155 160

tgg acc tac aac cta gcc ttc tgt ggc ccc aat gaa att aat cac ttc 528
Trp Thr Tyr Asn Leu Ala Phe Cys Gly Pro Asn Glu Ile Asn His Phe
165 170 175

tac tgt gcg gac cca cca ctg att aag ctg gct tgt tct gac acc tac 576
Tyr Cys Ala Asp Pro Pro Leu Ile Lys Leu Ala Cys Ser Asp Thr Tyr
180 185 190

aac aag gag ttg tca atg ttt att gtg gct ggc tgg aac ctt tct ttt 624
Asn Lys Glu Leu Ser Met Phe Ile Val Ala Gly Trp Asn Leu Ser Phe
195 200 205

tct ctc ttc atc ata tgt att tcc tac ctt tac att ttc cct gct att 672
Ser Leu Phe Ile Ile Cys Ile Ser Tyr Leu Tyr Ile Phe Pro Ala Ile
210 215 220

tta aag att cgc tct aca gag ggc agg caa aaa gct ttt tct acc tgt 720
Leu Lys Ile Arg Ser Thr Glu Gly Arg Gln Lys Ala Phe Ser Thr Cys
225 230 235 240

ggc tcc cat ctg aca gct gtc act ata ttc tat gca acc ctt ttc ttc 768
Gly Ser His Leu Thr Ala Val Thr Ile Phe Tyr Ala Thr Leu Phe Phe
245 250 255

atg tat ctc aga ccc ccc tca aag gaa tct gtt gaa cag ggt aaa atg 816
Met Tyr Leu Arg Pro Pro Ser Lys Glu Ser Val Glu Gln Gly Lys Met
260 265 270

gta gct gta ttt tat acc aca gta atc cct atg ctg aac ctt ata att 864
Val Ala Val Phe Tyr Thr Thr Val Ile Pro Met Leu Asn Leu Ile Ile
275 280 285

tat agc ctt aga aat aaa aat gta aaa gaa gca tta atc aaa gag ctg 912
Tyr Ser Leu Arg Asn Lys Asn Val Lys Glu Ala Leu Ile Lys Glu Leu
290 295 300

tca atg aag ata tac ttt tct taa 936
Ser Met Lys Ile Tyr Phe Ser *
305 310



<210> 11 

<211> 180 

<212> DNA 

<213> Homo sapiens 



<400> 11 

atg tcc aga aga aac tat act gaa ctg aca gaa ttt gtt ctc ttg ggt 48
Met Ser Arg Arg Asn Tyr Thr Glu Leu Thr Glu Phe Val Leu Leu Gly
1 5 10 15

cta aca agc cgt cca gag ctg cga gtt gct ttc ttg gca ctg ttc ctt 96
Leu Thr Ser Arg Pro Glu Leu Arg Val Ala Phe Leu Ala Leu Phe Leu
20 25 30

ttt gtc tac ata gcc act gtg gta gga aac ttg ggg atg att att tta 144
Phe Val Tyr Ile Ala Thr Val Val Gly Asn Leu Gly Met Ile Ile Leu
35 40 45

atc aaa gtt gat tct cga ctt cac act ccc atg taa 180
Ile Lys Val Asp Ser Arg Leu His Thr Pro Met *
50 55 60



<210> 12 
<211> 64 
<212> PRT 

<213> Homo sapiens 
<400> 12 

Met Ser Phe Leu Ile Arg Ser Asp Ser Thr Leu His Thr Pro Met Cys
1 5 10 15

Leu Phe Leu Ser His Leu Ser Phe Val Asp Leu Tyr Tyr Ala Thr Asn
20 25 30

Ala Thr Pro Pro Met Leu Val Asn Phe Phe Phe Pro Arg Glu Lys Pro
35 40 45

Phe Pro Leu Leu Val Ala Leu Ser Asn Phe Thr Phe Ser Leu His Trp
50 55 60



<210> 13 
<211> 315 
<212> PRT 

<213> Homo sapiens 



<400> 13 



Met Phe Ser Pro Asn His Thr Ile Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

Leu Thr Asp Asp Pro Val Leu Glu Lys Ile Leu Phe Gly Val Phe Leu
20 25 30

Ala Ile Tyr Leu Ile Thr Leu Ala Gly Asn Leu Cys Met Ile Leu Leu
35 40 45

Ile Arg Thr Asn Ser His Leu Gln Thr Pro Met Tyr Phe Phe Leu Gly
50 55 60

His Leu Ser Phe Val Asp Ile Cys Tyr Ser Ser Asn Val Thr Pro Asn
65 70 75 80

Met Leu His Asn Phe Leu Ser Glu Gln Lys Thr Ile Ser Tyr Ala Gly
85 90 95

Cys Phe Thr Gln Cys Leu Leu Phe Ile Ala Leu Val Ile Thr Glu Phe
100 105 110

Tyr Phe Leu Ala Ser Met Ala Leu Asp Arg Tyr Val Ala Ile Cys Ser
115 120 125

Pro Leu His Tyr Ser Ser Arg Met Ser Lys Asn Ile Cys Ile Ser Leu
130 135 140

Val Thr Val Pro Tyr Met Tyr Gly Phe Leu Asn Gly Leu Ser Gln Thr
145 150 155 160

Leu Leu Thr Phe His Leu Ser Phe Cys Gly Ser Leu Glu Ile Asn His
165 170 175

Phe Tyr Cys Ala Asp Pro Pro Leu Ile Met Leu Ala Cys Ser Asp Thr
180 185 190

Arg Val Lys Lys Met Ala Met Phe Val Val Ala Gly Phe Thr Leu Ser
195 200 205

Ser Ser Leu Phe Ile Ile Leu Leu Ser Tyr Leu Phe Ile Phe Ala Ala
210 215 220

Ile Phe Arg Ile Arg Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr
225 230 235 240

Cys Ala Ser His Leu Thr Ile Val Thr Leu Phe Tyr Gly Thr Leu Phe
245 250 255

Cys Met Tyr Val Arg Pro Pro Ser Glu Lys Ser Val Glu Glu Ser Lys
260 265 270

Ile Thr Ala Val Phe Tyr Thr Phe Leu Thr Pro Met Leu Asn Pro Leu
275 280 285

Ile Tyr Ser Leu Arg Asn Thr Asp Val Ile Leu Ala Met Gln Gln Met
290 295 300

Ile Arg Gly Lys Ser Phe His Lys Ile Ala Val
305 310 315

<210> 14

<211> 172 

<212> PRT 

<213> Homo sapiens 



<400> 14

Met Leu Lys Lys Asn His Thr Ala Val Thr Glu Phe Val Leu Leu Gly
1 5 10 15

Leu Thr Asp Arg Ala Glu Leu Gln Ser Leu Leu Phe Val Val Phe Leu
20 25 30

Val Ile Tyr Leu Ile Thr Val Ile Gly Asn Val Ser Met Ile Leu Leu
35 40 45

Ile Arg Ser Asp Ser Thr Leu His Thr Pro Met Tyr Phe Phe Leu Ser
50 55 60

His Leu Ser Phe Val Asp Leu Cys Tyr Thr Thr Asn Val Thr Pro Gln
65 70 75 80

Met Leu Val Asn Phe Leu Ser Lys Arg Lys Thr Ile Ser Phe Ile Gly
85 90 95

Cys Phe Ile Gln Phe His Phe Phe Ile Ala Leu Val Ile Thr Asp Tyr
100 105 110

Tyr Met Leu Thr Val Met Ala Tyr Asp Arg Tyr Met Ala Ile Cys Lys
115 120 125

Pro Leu Leu Tyr Gly Ser Lys Met Thr Arg Cys Val Cys Leu Cys Leu
130 135 140

Ala Ala Ala Pro Tyr Ile Tyr Gly Phe Ala Asn Gly Leu Ser Thr Asp
145 150 155 160

His Pro Asp Ala Ser Ser Val Leu Leu Trp Thr Gln
165 170














<210> 15

<211> 315

<212> PRT

<213> Homo sapiens



<400> 15

Met Leu Ser Pro Asn His Thr Ile Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

Leu Thr Asp Asp Pro Val Leu Glu Lys Ile Leu Phe Gly Val Phe Leu
20 25 30

Ala Ile Tyr Leu Ile Thr Leu Ala Gly Asn Leu Cys Met Ile Leu Leu
35 40 45

Ile Arg Thr Asn Ser Gln Leu Gln Thr Pro Met Tyr Phe Phe Leu Gly
50 55 60

His Leu Ser Phe Leu Asp Ile Cys Tyr Ser Ser Asn Val Thr Pro Asn
65 70 75 80

Met Leu His Asn Phe Leu Ser Glu Gln Lys Thr Ile Ser Tyr Ala Gly
85 90 95

Cys Phe Thr Gln Cys Leu Leu Phe Ile Ala Leu Val Ile Thr Glu Phe
100 105 110

Tyr Phe Leu Ala Ser Met Ala Leu Asp Arg Tyr Val Ala Ile Cys Ser
115 120 125

Pro Leu His Tyr Ser Ser Arg Met Ser Lys Asn Ile Cys Ile Ser Leu
130 135 140

Val Thr Val Pro Tyr Met Tyr Gly Phe Leu Asn Gly Leu Ser Gln Thr
145 150 155 160

Leu Leu Thr Phe His Leu Ser Phe Cys Gly Ser Leu Glu Ile Asn His
165 170 175

Phe Tyr Cys Ala Asp Pro Pro Leu Ile Met Leu Ala Cys Ser Asp Thr
180 185 190

Arg Val Lys Lys Met Ala Met Phe Val Val Ala Gly Phe Thr Leu Ser
195 200 205

Ser Ser Leu Phe Ile Ile Leu Leu Ser Tyr Leu Phe Ile Phe Ala Ala
210 215 220

Ile Phe Arg Ile Arg Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr
225 230 235 240

Cys Ala Ser His Leu Thr Ile Val Thr Leu Phe Tyr Gly Thr Leu Phe
245 250 255




Cys 


Met 


Tyr 


Val 


Arg 


Pro 


Pro 


Ser 


Glu 


Lys 


Ser 


Val 


Glu 


Glu 


Ser 


Lys 








260 










265 










270 






He 


He 


Ala 


Val 


Phe 


Tyr 


Thr 


Phe 


Leu 


Ser 


Pro 


Met 


Leu 


Asn 


Pro 


Leu 






275 










280 










285 








He 


Tyr 


Ser 


Leu 


Arg 


Asn 


Arg 


Asp 


Val 


He 


Leu 


Ala 


He 


Gin 


Gin 


Met 




290 










295 










300 










He 


Arg 


Gly Lys 


Ser 


Phe 


Cys 


Lys 


He 


Ala 


Val 













305 310 315 



<210> 16
<211> 305
<212> PRT
<213> Homo sapiens

<400> 16

Met Ser Asn Thr Asn Gly Ser Ala Ile Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

Leu Thr Asp Cys Pro Glu Leu Gln Ser Leu Leu Phe Val Leu Phe Leu
20 25 30

Val Val Tyr Leu Val Thr Leu Leu Gly Asn Leu Gly Met Ile Met Leu
35 40 45

Met Arg Leu Asp Ser Arg Leu His Thr Pro Met Tyr Phe Phe Leu Thr
50 55 60

Asn Leu Ala Phe Val Asp Leu Cys Tyr Thr Ser Asn Ala Thr Pro Gln
65 70 75 80

Met Ser Thr Asn Ile Val Ser Glu Lys Thr Ile Ser Phe Ala Gly Cys
85 90 95

Phe Thr Gln Cys Tyr Ile Phe Ile Ala Leu Leu Leu Thr Glu Phe Tyr
100 105 110

Met Leu Ala Ala Met Ala Tyr Asp Arg Tyr Val Ala Ile Tyr Asp Pro
115 120 125

Leu Arg Tyr Ser Val Lys Thr Ser Arg Arg Val Cys Ile Cys Leu Ala
130 135 140

Thr Phe Pro Tyr Val Tyr Gly Phe Ser Asp Gly Leu Phe Gln Ala Ile
145 150 155 160

Leu Thr Phe Arg Leu Thr Phe Cys Arg Ser Asn Val Ile Asn His Phe
165 170 175

Tyr Cys Ala Asp Pro Pro Leu Ile Lys Leu Ser Cys Ser Asp Thr Tyr
180 185 190

Val Lys Glu His Ala Met Phe Ile Ser Ala Gly Phe Asn Leu Ser Ser
195 200 205

Ser Leu Thr Ile Val Leu Val Ser Tyr Ala Phe Ile Leu Ala Ala Ile
210 215 220

Leu Arg Ile Lys Ser Ala Glu Gly Arg His Lys Ala Phe Ser Thr Cys
225 230 235 240

Gly Ser His Met Met Ala Val Thr Leu Phe Tyr Gly Thr Leu Phe Cys
245 250 255

Met Tyr Ile Arg Pro Pro Thr Asp Lys Thr Val Glu Glu Ser Lys Ile
260 265 270

Ile Ala Val Phe Tyr Thr Phe Val Ser Pro Val Leu Asn Pro Leu Ile
275 280 285

Tyr Ser Leu Arg Asn Lys Asp Val Lys Gln Ala Leu Lys Asn Val Leu
290 295 300

Arg
305



<210> 17
<211> 203
<212> PRT
<213> Homo sapiens

<400> 17

Met Val Arg Gly Asn Ser Thr Leu Val Thr Glu Phe Ile Leu Leu Gly
1 5 10 15

Leu Lys Asp Leu Pro Glu Leu Gln Pro Ile Leu Phe Val Leu Phe Leu
20 25 30

Leu Ile Tyr Leu Ile Thr Val Gly Gly Asn Leu Gly Met Leu Val Leu
35 40 45

Ile Arg Ile Asp Ser Arg Leu His Thr Pro Met Tyr Phe Phe Leu Ala
50 55 60

Ser Leu Ser Cys Leu Asp Leu Tyr Tyr Ser Thr Asn Val Thr Pro Lys
65 70 75 80

Met Leu Val Asn Phe Phe Ser Asp Lys Lys Ala Ile Ser Tyr Ala Ala
85 90 95

Cys Leu Val Gln Cys Tyr Phe Phe Ile Ala Val Val Ile Thr Glu Tyr
100 105 110

Tyr Met Leu Ala Val Met Ala Tyr Asp Arg Tyr Val Ala Ile Cys Asn
115 120 125

Pro Leu Leu Tyr Ser Ser Lys Met Ser Lys Gly Leu Cys Ile Arg Leu
130 135 140

Ile Ala Gly Pro Tyr Val Tyr Gly Phe Leu Ser Gly Leu Met Glu Thr
145 150 155 160

Met Trp Thr Tyr His Leu Thr Phe Cys Gly Ser Asn Ile Ile Asn His
165 170 175

Phe Tyr Cys Ala Asp Pro Pro Leu Ile Arg Leu Ser Cys Ser Asp Thr
180 185 190

Phe Ile Lys Glu Thr Ser Met Phe Val Val Ala
195 200

<210> 18
<211> 268
<212> PRT
<213> Homo sapiens

<400> 18

Leu Pro Ser Ser Arg Pro Thr Pro Arg Leu His Thr Pro Met Tyr Phe
1 5 10 15

Phe Leu Ser Asn Leu Ser Phe Val Asp Leu Cys Phe Ser Ser Asn Val
20 25 30

Thr Pro Arg Met Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser
35 40 45

Tyr Pro Ala Arg Leu Val Gln Cys Tyr Leu Phe Ile Thr Leu Val His
50 55 60

Val Glu Leu Tyr Ile Leu Ala Val Met Ala Phe Asp Arg Tyr Met Ala
65 70 75 80

Ile Cys Asn Pro Leu Leu Tyr Gly Ser Arg Met Ser Lys Ser Val Cys
85 90 95

Ser Phe Leu Ile Thr Val Leu Tyr Val Tyr Gly Ala Leu Thr Gly Leu
100 105 110

Met Glu Thr Met Trp Thr Tyr Asn Leu Ala Phe Cys Gly Pro Ser Glu
115 120 125

Ile Asn His Phe Tyr Cys Val Asp Pro Pro Leu Ile Lys Leu Ala Cys
130 135 140

Ser Asp Thr Tyr Asn Lys Glu Val Ser Met Phe Val Val Ala Gly Phe
145 150 155 160

Asn Phe Thr Tyr Pro Leu Leu Ile Ile Leu Ile Ser Tyr Leu Tyr Ile
165 170 175

Phe Pro Ala Thr Leu Arg Ile Cys Ser Thr Glu Gly Arg His Lys Ala
180 185 190

Phe Ser Thr Cys Gly Ser His Leu Thr Ala Val Thr Ile Phe Tyr Ser
195 200 205



Ala Leu Phe Phe Met Tyr Leu Arg Arg Pro Ser Glu Glu Ser Met Glu
210 215 220














Lys 




Val


Ala 


Val 


Phe 


Tyr


Thr 


Thr 


Val 


Ile


Pro 


Met 


Leu 


225 










230 










235 










240 


Asn Pro Met Ile Tyr Ser Leu Arg Asn Lys Asp Val Lys Glu Ala Leu
245 250 255

Cys Lys Glu Leu Phe Lys Arg Lys Leu Phe Ser Lys
260 265
















<210> 19
<211> 120
<212> PRT
<213> Homo sapiens

<400> 19

Met Arg Arg Asn Phe Thr Leu Val Thr Glu Phe Ile Leu Leu Gly Leu
1 5 10 15

Thr Asn His Gln Glu Leu Gln Ile Leu Leu Phe Met Leu Phe Leu Ala
20 25 30

Ile Tyr Met Val Thr Val Ala Gly Asn Leu Ser Met Ile Ala Leu Ile
35 40 45

Gln Ala Asn Ala Arg Leu His Thr Pro Met Tyr Phe Phe Leu Ser His
50 55 60

Leu Ser Phe Leu Asp Leu Cys Phe Ser Ser Asn Val Thr Pro Lys Met
65 70 75 80

Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser Tyr Pro Ala Cys
85 90 95

Leu Val Gln Cys Tyr Leu Tyr Ile Ile Leu Val His Val Glu Ile Tyr
100 105 110

Ile Leu Ala Val Met Ala Phe Asp
115 120


















<210> 20
<211> 311
<212> PRT
<213> Homo sapiens

<400> 20

Met Arg Arg Asn Cys Thr Leu Val Thr Glu Phe Ile Leu Leu Gly Leu
1 5 10 15

Thr Ser Arg Arg Glu Leu Gln Ile Leu Leu Phe Thr Leu Phe Leu Ala
20 25 30

Ile Tyr Met Val Thr Val Ala Gly Asn Leu Gly Met Ile Val Leu Ile
35 40 45

Gln Ala Asn Ala Trp Leu His Met Pro Met Tyr Phe Phe Leu Ser His
50 55 60

Leu Ser Phe Val Asp Leu Cys Phe Ser Ser Asn Val Thr Pro Lys Met
65 70 75 80

Leu Glu Ile Phe Leu Ser Glu Lys Lys Ser Ile Ser Tyr Pro Ala Cys
85 90 95

Leu Val Gln Cys Tyr Leu Phe Ile Ala Leu Val His Val Glu Ile Tyr
100 105 110

Ile Leu Ala Val Met Ala Phe Asp Arg Tyr Met Ala Ile Cys Asn Pro
115 120 125

Leu Leu Tyr Gly Ser Arg Met Ser Lys Ser Val Cys Ser Phe Leu Ile
130 135 140

Thr Val Pro Tyr Val Tyr Gly Ala Leu Thr Gly Leu Met Glu Thr Met
145 150 155 160

Trp Thr Tyr Asn Leu Ala Phe Cys Gly Pro Asn Glu Ile Asn His Phe
165 170 175

Tyr Cys Ala Asp Pro Pro Leu Ile Lys Leu Ala Cys Ser Asp Thr Tyr
180 185 190

Asn Lys Glu Leu Ser Met Phe Ile Val Ala Gly Trp Asn Leu Ser Phe
195 200 205

Ser Leu Phe Ile Ile Cys Ile Ser Tyr Leu Tyr Ile Phe Pro Ala Ile
210 215 220

Leu Lys Ile Arg Ser Thr Glu Gly Arg Gln Lys Ala Phe Ser Thr Cys
225 230 235 240

Gly Ser His Leu Thr Ala Val Thr Ile Phe Tyr Ala Thr Leu Phe Phe
245 250 255

Met Tyr Leu Arg Pro Pro Ser Lys Glu Ser Val Glu Gln Gly Lys Met
260 265 270

Val Ala Val Phe Tyr Thr Thr Val Ile Pro Met Leu Asn Leu Ile Ile
275 280 285

Tyr Ser Leu Arg Asn Lys Asn Val Lys Glu Ala Leu Ile Lys Glu Leu
290 295 300

Ser Met Lys Ile Tyr Phe Ser
305 310

<210> 21 

<211> 59 

<212> PRT 

<213> Homo sapiens 

<400> 21 

Met Ser Arg Arg Asn Tyr Thr Glu Leu Thr Glu Phe Val Leu Leu Gly
1 5 10 15

Leu Thr Ser Arg Pro Glu Leu Arg Val Ala Phe Leu Ala Leu Phe Leu
20 25 30

Phe Val Tyr Ile Ala Thr Val Val Gly Asn Leu Gly Met Ile Ile Leu
35 40 45

Ile Lys Val Asp Ser Arg Leu His Thr Pro Met
50 55

<210> 22 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<400> 22 

cctggagggt ttcaaaggct gatactttag 30

<210> 23 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<400> 23 

ctccagcctg agcaacagag caatac 26 

<210> 24 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<400> 24 

ctcacattca ttgttcttca cagacccagc 30 

<210> 25 
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<400> 25 

ccctgctggg atctggatca agac 24
<210> 26
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> sequencing oligonucleotide PrimerPU 
<400> 26 

tgtaaaacga cggccagt 

<210> 27 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> sequencing oligonucleotide PrimerRP 
<400> 27 

caggaaacag ctatgacc 



